Video of this presentation from Release Engineering work week in Portland, 29 April 2014
Part 1: Back to basics
What software do we produce for mobile phones?
- Firefox for Android (Fennec)
- Firefox OS (B2G)
What environments do we use for building and testing this software?
Product | Building | Testing |
Fennec | CentOS 6.2: (bld-linux64-ix-*) in-house, (bld-linux64-ec2-*) AWS | Tegra / Panda / Emulator |
B2G | CentOS 6.2 | Emulator |
So first key point unveiled:
- We don’t build on tegras and pandas (we only test!)
Second key point:
- Fennec is the only product we test on tegras and pandas (we don’t test B2G on real devices)
So why do we test Fennec on tegras, pandas and emulators?
To answer this, first remember the wide variety of builds and tests we perform:
The answer is:
- We use tegras to test: Android 2.2 (Froyo)
- We use pandas to test: Android 4.0 (Ice Cream Sandwich)
- We use emulators to test: Android 2.3 (Gingerbread) and Android 4.2 (Jelly Bean)
Notice:
- We don’t test on 3.x (Honeycomb)
- We don’t test on 4.4 (KitKat)
- The versions we test on emulators are not sequential (i.e. we test 2.3 and 4.2 on emulators, while 4.0, which sits between these two versions, is tested on pandas)
What are the main differences between our tegras and pandas?
Tegras | Pandas |
Look like this: | Look like this: |
Racked up like this: | Racked up like this: |
Older | Newer |
Running Android 2.2 | Running Android 4.0 |
Hanging in shoe racks | Racked professionally in Faraday cages |
Can only be reimaged by physically connecting them to a laptop, and pressing buttons in a magical sequence | Can be remotely reimaged by mozpool (moar to come later) |
Not very reliable | Quite reliable |
Is connected to a “PDU” which allows us to programmatically call an API to “pull the power” | Is connected to a “relay host” which allows us to programmatically call an API to “pull the power” |
So as you see, a panda is a more serious piece of kit than a tegra. Think of a tegra as a toy.
So what are tegras and pandas, actually?
Both are mobile device boards, as you see above, like you would get in a phone, but not actually in a phone.
So why don’t we just use real phones?
- Real phones use batteries
- Real phones have wireless network
Basically, by using the boards directly, we can:
- control the power supply (by connecting them to power units – PDUs) which we have API access to (i.e. we have an API to pull the power to a device)
- use ethernet rather than wireless (ethernet is more reliable, there are no wireless signals to interfere with each other, there is less radiation, …)
OK, so we have phones (or “phone circuit boards”) wired up to our network – but how do we communicate with them?
Fennec historically ran on more platforms than just Android. It also ran on:
- Windows Mobile
- the Nokia N900 Maemo device
For this reason, it was decided to create a generic interface, which would be implemented on all supported platforms. The SUT Agent was born.
Please note: nowadays, Fennec is only available for Android 2.2+. It is not available for iOS (iPhone, iPad, iPod Touch), Windows Phone, Windows RT, Bada, Symbian, Blackberry OS, webOS or other mobile operating systems.
Therefore, the original reason for creating a standard interface to all devices (the SUT Agent) no longer exists. It would also be possible to use a different mechanism (telnet, ssh, adb, …) to communicate with the device. However, this is not what we do.
So what is the SUT Agent, and what can it do?
The SUT Agent is a listener running on the tegra or panda, that can receive calls over its network interface, to tell it to perform tasks. You can think of it as something like an ssh daemon, in the sense that you can connect to it from a different machine, and issue commands.
How do you connect to it?
You simply telnet to the tegra or panda, on port 20700 or 20701.
Why two ports? Are they different?
Only marginally. The original idea was that users would connect on port 20701, and that automated systems would connect on port 20700. For this reason, if you connect on port 20700, you don’t get a prompt. If you connect on port 20701, you do. However, everything else is the same. You can issue commands to both listeners.
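As a rough illustration of the two listeners, here is a minimal Python sketch that opens a TCP connection to the prompt-emitting port and sends a single command. It rests on assumptions that the text above does not confirm: that commands are terminated by a newline, and that the agent writes a `$>` prompt before and after each reply. The names `run_sut_command` and `read_until_prompt` are made up for this example and are not part of any Mozilla tool.

```python
import socket

USER_PORT = 20701        # listener intended for people: sends a prompt
AUTOMATION_PORT = 20700  # listener intended for automation: no prompt
PROMPT = b"$>"           # assumed prompt string on the user port


def read_until_prompt(sock):
    """Collect bytes until the prompt appears or the peer closes the connection."""
    data = b""
    while not data.endswith(PROMPT):
        chunk = sock.recv(4096)
        if not chunk:  # connection closed by the other side
            break
        data += chunk
    return data


def run_sut_command(host, command, port=USER_PORT, timeout=30.0):
    """Send one command to the prompt-emitting listener and return the reply text."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        read_until_prompt(sock)                         # discard the initial prompt
        sock.sendall(command.encode("ascii") + b"\n")   # assumed line terminator
        reply = read_until_prompt(sock)
    if reply.endswith(PROMPT):
        reply = reply[: -len(PROMPT)]                   # drop the trailing prompt
    return reply.decode("utf-8", errors="replace").strip()
```

Something like `run_sut_command("panda-0149", "ver")` would then return whatever the agent prints for `ver`. Waiting for the prompt only works on port 20701; on port 20700 there is no prompt, so a client would need some other way to know when a reply is complete.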
What commands does it support?
The most important command is “help”. It displays this output, showing all available commands:
pmoore@fred:~/git/tools/sut_tools master $ telnet panda-0149 20701
Trying 10.12.128.132...
Connected to panda-0149.p1.releng.scl1.mozilla.com.
Escape character is '^]'.
$>help
run [cmdline] - start program no wait
exec [env pairs] [cmdline] - start program no wait optionally pass env key=value pairs (comma separated)
execcwd <dir> [env pairs] [cmdline] - start program from specified directory
execsu [env pairs] [cmdline] - start program as privileged user
execcwdsu <dir> [env pairs] [cmdline] - start program from specified directory as privileged user
execext [su] [cwd=<dir>] [t=<timeout>] [env pairs] [cmdline] - start program with extended options
kill [program name] - kill program no path
killall - kill all processes started
ps - list of running processes
info - list of device info
[os] - os version for device
[id] - unique identifier for device
[uptime] - uptime for device
[uptimemillis] - uptime for device in milliseconds
[sutuptimemillis] - uptime for SUT in milliseconds
[systime] - current system time
[screen] - width, height and bits per pixel for device
[memory] - physical, free, available, storage memory for device
[processes] - list of running processes see 'ps'
alrt [on/off] - start or stop sysalert behavior
disk [arg] - prints disk space info
cp file1 file2 - copy file1 to file2
time file - timestamp for file
hash file - generate hash for file
cd directory - change cwd
cat file - cat file
cwd - display cwd
mv file1 file2 - move file1 to file2
push filename - push file to device
rm file - delete file
rmdr directory - delete directory even if not empty
mkdr directory - create directory
dirw directory - tests whether the directory is writable
isdir directory - test whether the directory exists
chmod directory|file - change permissions of directory and contents (or file) to 777
stat processid - stat process
dead processid - print whether the process is alive or hung
mems - dump memory stats
ls - print directory
tmpd - print temp directory
ping [hostname/ipaddr] - ping a network device
unzp zipfile destdir - unzip the zipfile into the destination dir
zip zipfile src - zip the source file/dir into zipfile
rebt - reboot device
inst /path/filename.apk - install the referenced apk file
uninst packagename - uninstall the referenced package and reboot
uninstall packagename - uninstall the referenced package without a reboot
updt pkgname pkgfile - unpdate the referenced package
clok - the current device time expressed as the number of millisecs since epoch
settime date time - sets the device date and time (YYYY/MM/DD HH:MM:SS)
tzset timezone - sets the device timezone format is GMTxhh:mm x = +/- or a recognized Olsen string
tzget - returns the current timezone set on the device
rebt - reboot device
adb ip|usb - set adb to use tcp/ip on port 5555 or usb
activity - print package name of top (foreground) activity
quit - disconnect SUTAgent
exit - close SUTAgent
ver - SUTAgent version
help - you're reading it
$>quit
quit
$>Connection closed by foreign host.
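Because the help text follows a regular `name arguments - description` shape, it can be turned into a lookup table. The sketch below assumes each entry arrives on its own line, and it uses a shortened sample taken from the output above rather than a live device. `parse_help` is a hypothetical helper written for this example.

```python
SAMPLE_HELP = """\
run [cmdline] - start program no wait
kill [program name] - kill program no path
info - list of device info
[os] - os version for device
[id] - unique identifier for device
rebt - reboot device
quit - disconnect SUTAgent
exit - close SUTAgent
"""


def parse_help(text):
    """Map each command name to an (arguments, description) tuple."""
    commands = {}
    for line in text.splitlines():
        # Split on the first " - ", which separates usage from description.
        usage, sep, description = line.partition(" - ")
        usage = usage.strip()
        if not sep or usage.startswith("["):
            continue  # skip lines without a separator and the bracketed "info" options
        name, _, args = usage.partition(" ")
        commands[name] = (args.strip(), description.strip())
    return commands


commands = parse_help(SAMPLE_HELP)
print(sorted(commands))
```

A table like this makes it easy to check that a command exists before sending it. Note that `rebt` appears twice in the real output, so a dictionary keeps only one entry for it.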
Typically we use the SUT Agent to query the device, push Fennec and tests onto it, run tests, perform file system commands, execute system calls, and retrieve results and data from the device.
What is the difference between quit and exit commands?
I’m glad you asked. “quit” will terminate the session. “exit” will shut down the SUT Agent. You really don’t want to do this. Be very careful.
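One way to avoid sending “exit” by accident from a script is to check each command before it goes over the wire. The following is only a sketch of that idea; `check_command` and `ForbiddenCommandError` are names invented here, and the case-insensitive match is a cautious assumption, since the text does not say whether the agent is case-sensitive.

```python
FORBIDDEN_COMMANDS = {"exit"}  # "exit" shuts down the SUT Agent itself


class ForbiddenCommandError(ValueError):
    """Raised when a command that would stop the SUT Agent is requested."""


def check_command(command_line):
    """Return the command line unchanged, or raise if it would stop the agent."""
    words = command_line.split()
    # Compare in lower case to be conservative (assumption, see lead-in).
    if words and words[0].lower() in FORBIDDEN_COMMANDS:
        raise ForbiddenCommandError(
            f"refusing to send {words[0]!r}: it shuts down the SUT Agent; "
            "use 'quit' to end the session instead"
        )
    return command_line
```

A script would call `check_command("quit")` and get the string back unchanged, while `check_command("exit")` raises before anything reaches the device.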
Is the SUT Agent a daemon? If it dies, will it respawn?
No, it isn’t, but yes, it will!
The SUT Agent can die, and sometimes does. However, it has a daddy, who watches over it. The Watcher is a daemon, also running on the pandas and tegras, that monitors the SUT Agent. If the SUT Agent dies, the Watcher will spawn a new SUT Agent.
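The monitor-and-respawn pattern described here can be sketched generically in Python. This is not the Watcher's actual code, which runs on the Android devices; it only shows the idea of a parent process that restarts a child whenever it exits. The `max_restarts` bound and the `delay` parameter are hypothetical additions so that the example terminates.

```python
import subprocess
import sys
import time


def supervise(argv, max_restarts=3, delay=0.1):
    """Run argv, starting it again each time it exits; return the number of starts."""
    starts = 0
    while starts <= max_restarts:
        process = subprocess.Popen(argv)
        starts += 1
        process.wait()     # block until the supervised process dies
        time.sleep(delay)  # brief pause so a crash loop does not spin the CPU
    return starts


if __name__ == "__main__":
    # Stand-in for the agent: a child process that exits immediately.
    child = [sys.executable, "-c", "raise SystemExit(1)"]
    print(supervise(child, max_restarts=2))
```

A real supervisor would normally loop forever and log each restart; the bounded loop here just keeps the demonstration short.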
It would probably be possible to run the SUT Agent as an auto-respawning daemon. I’m not sure why it isn’t done this way.
Who created the Watcher?
Legend has it that the Watcher was created by Bob Moss.
Where is the source code for the SUT Agent and the Watcher?
The SUT Agent codebase lives in the Firefox desktop source tree: http://hg.mozilla.org/mozilla-central/file/tip/build/mobile/sutagent
The Watcher code lives there too: http://hg.mozilla.org/mozilla-central/file/tip/build/mobile/sutagent/android/watcher
Do the Watcher and SUT Agent get automatically deployed when there are new changes?
No. If there are changes, they need to be manually built (no continuous integration) and manually deployed to all tegras, and a new image needs to be created for pandas in mozpool (will be explained later).
Fortunately, there are very rarely changes to either component.
Summary part 1
So we’ve learned:
- Tegras and Pandas are used for testing Fennec for Android
- They run different versions of the Android OS (2.2 vs 4.0)
- We don’t build anything on them
- Tegras are older/inferior/less reliable than pandas
- We can’t reimage tegras programmatically, but we can reimage pandas
- There is a SUT Agent that runs on both the tegras and the pandas, and provides a mechanism to interact with them
- There is a Watcher that keeps the SUT Agent alive
- Whenever a new version of SUT Agent or Watcher is required, this needs to be manually built and rolled out to devices