tipuesearch_content.json
{"pages":[{"url":"http://jhshi.me/not-found/index.html","text":"The requested item could not be located. Perhaps you might want to check the Archives ?","tags":"misc","title":"Not Found"},{"url":"http://jhshi.me/home/index.html","text":"Hello world! This is Jinghao Shi [1] (Chinese name: ). I am currently a third year PhD student in the Department of Computer Science and Engineering , University at Buffalo . My advisors are Dr. Geoffrey Challen and Prof. Chunming Qiao . I also work closely with Dr. Dimitrios Koutsonikolas and Dr. Ranveer Chandra . I received my BS degree in Computer Science in 2011 from the University of Science and Technology of China . Then I went to the University of Hong Kong . I spent a year and a half there before I dropped out of my program and came to Buffalo. You can find my CV here . Research I am interested in wireless networks (mostly Wifi) and mobile systems (mostly Android). I am excited to explore the potential of the combination of these two technologies. Publications Me on DBLP , Google Scholar . Teaching I was a TA for these courses. CSE241 Digital Systems, Fall 2013 CSE521 Introduction to Operating Systems , Spring 2014 CSE521 Introduction to Operating Systems , Spring 2015 About This Blog I started writing blogs back in 2011, when I had just started using Linux (more specifically Fedora ) as my main OS [2] . I often found myself trying to remember what I did to solve a glitch a while ago, and ended up re-Googling to find out. So I figured I'd better keep some notes for my own future reference. And it turns out some of the posts may be somewhat helpful to others as well, so I stuck with it. At the beginning, I used WordPress.com , which has an awful web interface. And what drove me crazy was that I could not use Vim to edit my posts. I lived with it for more than a year and finally decided to migrate to Octopress , a static site generator. Then I switched to Pelican and now I am a much happier blogger. [1] How to Pronounce?
\"Jing\" is like \"Jin\" ( note: without \"g\" ) in \"Jingle-bell\", and \"hao\" is like \"ho\" in \"hot\". [2] I ditched Fedora after I found that they roll out new releases every 6 months, and there is usually a bunch of problems upgrading to the new release.","tags":"misc","title":""},{"url":"http://jhshi.me/productive/index.html","text":"This is a collection of tools that make me more productive, and my life much easier. Or as I would say to my friends: \"They are life-changing!\" Vim I don't know about Emacs, but other than that, Vim is THE only editor you should use (if you ever write any code)! Vimium Browse the web in a Vim way. A must-have for Vim addicts (like me). Bash VI mode Want to use j , k , h , l to navigate in bash? Want Esc then dd ? You want the vi mode of bash. Check this article on how to set it up. Again, a must-have for vim addicts. Tmux If you ever need to log in to another machine to do some stuff, then you'll need tmux. What? screen ? Is it even designed for humans ? Dropbox A must-have if you have more than one computing device, by which I mean laptops, desktops, smart phones, tablets, and so on. It just makes file syncing so smooth and so painless. LaTeX I love typesetting using LaTeX. For a long time, it's been the reason that I always finish my homework and project reports early. Python Life is short (You need Python). Git Before Git, the only VCS I ever used was SVN, which was awful, and it requires you to set up an SVN server to work. Thus I didn't use it much. Now with Git, everything feels so easy and painless. And I use it to manage virtually all my projects (even this blog!) Mechanical Keyboard They make you love typing, and you just cannot go back to membrane ones anymore.","tags":"misc","title":""},{"url":"http://jhshi.me/2018/02/11/opengl-over-vnc/index.html","text":"I've been using SketchUp via VMWare Player for a while and the software just hangs now and then even after fixing the OpenGL support issue .
I happen to have another PC running Windows 7. But I ran into the OpenGL problem again while trying to use remote desktop. The VNC software I use is Remmina . After enabling remote desktop in my Windows 7 box, I tried to log in using Remmina and open the SketchUp application. It pops up the same \"Hardware acceleration is unsupported\" error message. After Googling around, I found OpenGL does not play well over VNC <https://en.wikipedia.org/wiki/VirtualGL> . I first tried TeamViewer <https://www.teamviewer.us/> , which almost works but with one showstopper: pressing the mouse wheel down does not work. It is used a lot in SketchUp to pan or rotate viewpoints, and is something I can definitely not live without. Another route would be to add VirtualGL support to Remmina, which sounds like a lot of hassle. Finally, I found that if I first physically log in to my Windows 7 machine and open the SketchUp app, then log in using Remmina from my Ubuntu machine, SketchUp remains open and will happily run without any problem. For now I can live with it: just physically open the SketchUp app once and remember not to close it. If for some reason you do not have physical access to the Windows machine, your next best bet would be TeamViewer.","tags":"linux","title":"OpenGL over VNC"},{"url":"http://jhshi.me/2018/01/23/fix-vmware-player-3d-support-issue/index.html","text":"I recently installed a Windows 10 guest OS on my Ubuntu 16.04 host machine using VMWare Workstation 12 Player , mainly to use the SketchUp Make 2017 software. SketchUp keeps complaining about lack of OpenGL support. Here's how to fix it. Here are the two error messages that VMWare Player shows when starting the guest PC. No 3D support is available from the host. and Hardware graphics acceleration is not available.
And when I try to open SketchUp inside Windows, it complains about lack of hardware acceleration support as well: First, I made sure my host OS (Ubuntu) does have hardware graphics support: $ sudo apt-get install mesa-utils $ glxinfo | grep \"direct\" direct rendering: Yes Make sure you see the \"direct rendering: Yes\" line. Next, edit $HOME/.vmware/preferences and either add or edit this line: mks.gl.allowBlacklistedDrivers = \"TRUE\" This just tells VMWare Player not to be too picky about the hardware drivers (apparently the driver on Ubuntu was blacklisted for some reason). Of course don't forget to enable 3D graphics acceleration in VMWare settings. Then close the VMWare Player, relaunch it and boot up the guest OS. Now it should not complain about hardware acceleration support and SketchUp should just run fine. Thanks to: https://askubuntu.com/questions/832755/no-3d-support-is-available-from-the-host-on-all-vmware-guests https://www.dizwell.com/wordpress/technical-articles/linux/enable-3d-graphics-for-vmware-guests/","tags":"linux","title":"Fix VMWare Player 3D Support Issue"},{"url":"http://jhshi.me/2017/09/29/google-dns-configuration-on-ubuntu-1604/index.html","text":"I recently experienced unstable DNS on my Ubuntu laptop. Here is how to configure the DNS settings so it always uses the Google DNS servers first. The DNS settings were obtained as part of the DHCP response. We need to configure the DHCP client on the laptop to prepend our custom DNS servers. Edit /etc/dhcp/dhclient.conf and find this line: # prepend domain-name-servers Uncomment it and configure Google DNS accordingly. prepend domain-name-servers 8.8.8.8, 8.8.4.4 ; Note there is a ; at the end of the line. Next, restart network manager.
$ sudo service network-manager restart The Google DNS should now be used first.","tags":"linux","title":"Google DNS Configuration on Ubuntu 16.04"},{"url":"http://jhshi.me/2017/09/21/fix-touchpad-natual-scrolling-of-ubuntu-1604-on-thinkpad-x1/index.html","text":"I recently installed Ubuntu 16.04.3 on my Thinkpad X1 Carbon 3rd Gen laptop. However, there is no \"Natural Scrolling\" option for the touch pad. Here is how to fix it. Use this command to enable Natural Scrolling. $ gsettings set org.gnome.desktop.peripherals.touchpad natural-scroll true Here are the mouse settings: there's no natural scrolling option!","tags":"linux","title":"Fix Touchpad Natural Scrolling of Ubuntu 16.04 on Thinkpad X1"},{"url":"http://jhshi.me/2017/03/06/backing-up-files-using-amazon-glacier/index.html","text":"Amazon Glacier is a cheap massive cloud storage solution that is mostly suitable for storing cold data - data that are rarely accessed. The price is fair: $4/TB/month. However, it's not like Dropbox or Google Drive, which have nice client programs where you can simply drag and drop the files to be stored. Instead, you'll have to work with their APIs to upload your files. In this post, I'll explain the basics about how to upload the files and also how to query the inventory. Basic APIs I use the boto3 API in Python. The documentation for Glacier can be found here . I'll use the Client APIs, which simply wrap the underlying HTTP requests. In particular, these are the APIs we'll be using for basic upload and query. upload_archive : upload files. delete_archive : delete files. Note that files on Glacier are not mutable. To update a file, you'll have to delete the old one and then upload the new one. initiate_job : to download files stored in Glacier or to query the inventory. describe_job : to query job status, since initiate_job is asynchronous. Upload Files We'll use the upload_archive API to upload a file. Things to note: You need an access ID and key pair to use the API.
Follow the guide to set up boto3 correctly. You must create a \"vault\" before you can upload. You can do this in the Glacier management console. To upload a file: client = boto3 . client ( 'glacier' ) with open ( path , 'rb' ) as f : response = client . upload_archive ( vaultName = 'myvault' , archiveDescription = path , body = f ) # persist the map between response['archiveId'] and path somewhere # locally Note that the archiveDescription argument is optional, but we utilize it to store the file's local path. This will help with bookkeeping later on. Inside Glacier, the file is solely identified by the archiveId . It is advised to keep a local database of the files stored in Glacier, since the inventory is only updated every 24 hrs. Update Inventory Sometimes the local archive database may be out-of-sync with Glacier, in which case a force-sync may be necessary. Basically we'll pull the inventory of Glacier and re-build the local archive database from that. Warnings: The Glacier inventory is only updated every 24 hrs. So files uploaded within the last 24 hrs may not be reflected in the inventory. The inventory query can take up to several hours to finish. The main API we will use is initiate_job , together with describe_job to query job status and get_job_output to retrieve the results once the job is finished. The same workflow can also be used to download a previously uploaded archive using the archive ID. But here we'll only show how to query the inventory. job_req = client . initiate_job ( vaultName = 'myvault' , jobParameters = { 'Type' : 'inventory-retrieval' }) while True : status = client . describe_job ( vaultName = 'myvault' , jobId = job_req [ 'jobId' ]) if status [ 'Completed' ]: break time . sleep ( 300 ) job_resp = client . get_job_output ( vaultName = 'myvault' , jobId = job_req [ 'jobId' ]) # first download the output and then parse the JSON output = job_resp [ 'body' ] .
read () archive_list = json . loads ( output )[ 'ArchiveList' ] # persist archive_list","tags":"linux","title":"Backing Up Files Using Amazon Glacier"},{"url":"http://jhshi.me/2017/03/02/generating-all-in-one-latex-file-for-journal-submission/index.html","text":"Recently I needed to submit to a journal that does NOT accept the final PDF but requires all LaTeX sources so that they can compile the PDF on their own. Given that any LaTeX project of reasonable size has multiple *.tex , *.bib and figure files, it'll be nice to flatten the LaTeX project so that we have as few files as possible to upload. Here is how. latexpand There is a latexpand utility that comes with the TexLive package in Ubuntu. It does almost exactly what we need: expand LaTeX files. I used it as follows: $ latexpand --expand-bbl main.bbl main.tex -o all-in-one.tex The --expand-bbl option will replace the \\bibliography command with a list of bibitem entries so that the main.bib does not need to be uploaded. The all-in-one.tex will be the ONLY LaTeX file we need to upload. Figures We still need to upload all the included figures. Oftentimes we generate more figures than we actually include in the final manuscript. We can get the list of actually included figures using this command: $ grep \"includegraphics\" all-in-one.tex | cut -d \\{ -f 2 | cut -d \\} -f 1 One caveat is that if all figures are put in a dedicated directory, e.g., ./figures , you'll have to use LaTeX's \\graphicspath command to specify the path, and use only the file name in includegraphics . In other words, \\includegraphics{./figures/abc.pdf} will not work in the all-in-one.tex unless you create that directory structure in the submission site.
So instead, do this: \\usepackage { graphicx } % note the path must end with \"/\" \\graphicspath {{ ./figures/ }} \\begin { figure } \\includegraphics { abc.pdf } \\end { figure } With the above organization, this Makefile snippet will generate the all-in-one.tex file and also copy all included figures into a separate directory ( submitted ). submit : @mkdir -p submitted @latexpand --expand-bbl main.bbl main.tex -o submitted/all-in-one.tex @ $( foreach fig, $( shell grep \"includegraphics\" submitted/all-in-one.tex | cut -d \\{ -f 2 | cut -d \\} -f 1 ) , /bin/cp -rfv ./figures/ $( fig ) submitted/ ; ) # note the final semi-colon in the last command The files in the submitted directory are all the files you need to upload.","tags":"latex","title":"Generating All-In-One LaTeX File for Journal Submission"},{"url":"http://jhshi.me/2016/11/07/ise-error-version-glibcxx_349-not-found/index.html","text":"I encountered this error while trying to run the Xilinx xlcm tool for ISE 12.2. Here is how to fix it. The Error The xlcm command runs fine until it tries to spawn a web page.
Here is the error message: /usr/bin/google-chrome-stable: /opt/Xilinx/12.2/ISE_DS/common//lib/lin64/libstdc++.so.6: version `GLIBCXX_3.4.9' not found (required by /usr/bin/google-chrome-stable) /usr/bin/google-chrome-stable: /opt/Xilinx/12.2/ISE_DS/common//lib/lin64/libstdc++.so.6: version `CXXABI_1.3.5' not found (required by /usr/bin/google-chrome-stable) /usr/bin/google-chrome-stable: /opt/Xilinx/12.2/ISE_DS/common//lib/lin64/libstdc++.so.6: version `GLIBCXX_3.4.10' not found (required by /usr/bin/google-chrome-stable) /usr/bin/google-chrome-stable: /opt/Xilinx/12.2/ISE_DS/common//lib/lin64/libstdc++.so.6: version `GLIBCXX_3.4.15' not found (required by /usr/bin/google-chrome-stable) /usr/bin/google-chrome-stable: /opt/Xilinx/12.2/ISE_DS/common//lib/lin64/libstdc++.so.6: version `GLIBCXX_3.4.11' not found (required by /usr/bin/google-chrome-stable) /usr/bin/google-chrome-stable: /opt/Xilinx/12.2/ISE_DS/common//lib/lin64/libstdc++.so.6: version `GLIBCXX_3.4.14' not found (required by /usr/bin/google-chrome-stable) It appears the xlcm binary was statically configured to use the libstdc++ that is shipped with ISE. However, those libraries are too old for modern applications such as Google Chrome. The Fix First, make sure your system's libstdc++ file has the needed GLIBCXX versions. On Ubuntu: $ strings /usr/lib/x86_64-linux-gnu/libstdc++.so.6 | grep GLIBCXX GLIBCXX_3.4 GLIBCXX_3.4.1 GLIBCXX_3.4.2 GLIBCXX_3.4.3 GLIBCXX_3.4.4 GLIBCXX_3.4.5 GLIBCXX_3.4.6 GLIBCXX_3.4.7 GLIBCXX_3.4.8 GLIBCXX_3.4.9 GLIBCXX_3.4.10 GLIBCXX_3.4.11 GLIBCXX_3.4.12 GLIBCXX_3.4.13 GLIBCXX_3.4.14 GLIBCXX_3.4.15 GLIBCXX_3.4.16 GLIBCXX_3.4.17 GLIBCXX_3.4.18 GLIBCXX_3.4.19 GLIBCXX_DEBUG_MESSAGE_LENGTH That's the right libstdc++ file that xlcm should use. Now, just override ISE's outdated copy with this one.
$ cd /opt/Xilinx/12.2/ISE_DS/common/lib/lin64/ $ mv libstdc++.so.6 ise_libstdc++.so.6 $ cp /usr/lib/x86_64-linux-gnu/libstdc++.so.6 libstdc++.so.6 Use sudo where required. Now launch xlcm again and it should be able to spawn a web page successfully.","tags":"errors","title":"ISE Error: version `GLIBCXX_3.4.9' not found"},{"url":"http://jhshi.me/2016/10/24/chromecast-wireless-protocols-part-ii-cast/index.html","text":"In my previous post , I explored how the cast device finds and configures the Chromecast dongle. In this post, I'll dig into the actual cast process. Cast Screen When casting the screen, the cast device (Nexus 6P in my case) basically needs to send a series of screenshots (probably 30 or 60 FPS) to the Chromecast, which incurs a high throughput demand (double the actual throughput required) on the Wifi network if this traffic is routed by the AP, i.e., cast device -> AP -> Chromecast. Standards such as Miracast exist for this purpose (HDMI over Wifi), but they require a direct ad-hoc network between the two devices. In the case of Chromecast, however, both the Nexus 6P and the Chromecast (at least the Nexus 6P) are still associated with the AP, from my experience. So how does Chromecast do it? I captured a packet trace while setting up a screen cast session, and found one interesting packet type that I was not aware of: TDLS , which is short for the Tunneled Direct Link Setup protocol. The setup packets are shown in the following screenshot. The 80:13 device is my Nexus 6P, the e8:de device is the Wifi AP and the 6c:ad device is the Chromecast dongle. We can see that the Nexus 6P first negotiates the TDLS parameters via the TDLS Discovery process, then a TDLS link is set up between the Nexus 6P and the Chromecast. I confirmed that direct traffic between the Nexus 6P and the Chromecast happened afterwards. The same mechanism applies for casting a Chromium tab as well.
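To put the doubled-throughput point in perspective, here is a back-of-the-envelope sketch in Python; the resolution, frame rate, color depth and compression ratio are illustrative assumptions, not measurements from the trace:

```python
# Back-of-the-envelope estimate of screen-cast throughput demand.
# All parameter values below are illustrative assumptions, not measurements.

def cast_demand_mbps(width, height, fps, bits_per_pixel, compression_ratio):
    """Approximate screen-cast bitrate in Mbit/s after compression."""
    raw_bps = width * height * bits_per_pixel * fps
    return raw_bps / compression_ratio / 1e6

# Assume a 720p stream at 30 FPS, 24-bit color, 100x video compression.
stream = cast_demand_mbps(1280, 720, 30, 24, 100)

# Routed through the AP (cast device -> AP -> Chromecast), the same bytes
# occupy the wireless channel twice; over a direct TDLS link, only once.
via_ap = 2 * stream
via_tdls = stream
print('via AP: %.2f Mbps, via TDLS: %.2f Mbps' % (via_ap, via_tdls))
```

Whatever the exact stream bitrate is, relaying through the AP costs twice the airtime of a direct TDLS link, which is exactly the overhead the direct link avoids.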
Cast YouTube In another cast scenario, the content originates not from the cast device but from some web server, as in YouTube, Netflix, etc. Here I did not observe a TDLS setup process, which makes sense since the content is not on the cast device, and the lightweight signaling packets (content URL, pause, volume) between the cast device and the Chromecast do not justify the overhead of setting up a TDLS link.","tags":"network","title":"Chromecast Wireless Protocols Part-II: Cast"},{"url":"http://jhshi.me/2016/10/24/chromecast-wireless-protocols-part-i-setup/index.html","text":"There are plenty of resources online that explain how Chromecast works. But most of them focus on upper-layer protocols, such as mDNS and DIAL/HTTP. I am more interested in the 802.11 MAC layer. In particular, I was curious about questions such as: What happens when you set up a Chromecast? How do the cast device (such as an Android phone) and the Chromecast communicate (at the 802.11 layer)? Some of the questions were obvious, others were not. In this post, I will document the Chromecast setup process. This will be the first of a series of posts on this topic. Hardware and Tools Hardware: Chromecast (first gen, model: H2G2-42 ): test device TP-LINK WDR3500: AP TP-LINK WDR3500: Wifi sniffer, capture packets for analysis Nexus 6P: cast device Software Tools: OpenWRT : running on both the AP and the sniffer. Makes AP configuration and trace collection easy. tcpdump : used to collect traces Wireshark (v2.0.2): used to view traces Google Cast app on Android: used to set up the Chromecast. Chromecast Setup The cast device and the Chromecast dongle have to connect to the same Wifi Access Point (AP) before the cast can happen. Because the Chromecast does not have a GUI where you can configure it to connect to your Wifi network, this step is done indirectly on the cast device.
The basic flow is this: The Chromecast dongle creates a Wifi network with the default SSID ChromecastXXXX , where XXXX is a 4-digit number identifying the device. The Google Cast app searches for such networks and associates with one once found. You select which AP the Chromecast device should connect to, and enter credentials accordingly. The Chromecast device tries to connect to the AP using the credentials provided in the last step. Once the Chromecast is connected to the AP, it sets the SSID field of the beacon frames to NULL (0 in length) such that the ChromecastXXXX SSID disappears from your phone's scan results. The Beacon Here is a snapshot of the beacon frame sent by the Chromecast device BEFORE it is configured. There are a couple of interesting facts I found. The OUI of the Chromecast device ( fa:8f:ca ) is actually not registered. I cannot find it anywhere ( Wireshark OUI lookup , IEEE OUI database ). I don't know how to interpret this... I tried to fool the Google Cast app by creating a fake Wifi AP with the SSID Chromecase5089 , to see if it would be listed as a Chromecast in the app. The answer is: NO. Then I realized of course not, since you can name the Chromecast device whatever you want after setting it up, so the SSID is not a good classifier of whether an AP is a potential Chromecast device. My second try was to fake the BSSID, especially the OUI. I set the BSSID of my test router to some value similar to the true Chromecast's. It worked this time. As I guessed earlier, the SSID does not matter at all, as shown in the following screenshot. The first is the true Chromecast device, while the second one is a fake. Link Setup After the cast device connects to the mini Wifi network created by the Chromecast dongle, we can instruct the Chromecast to connect to the actual AP. After filling in the AP to connect to, I observed an association request from a device with a different MAC address (OUI 6c:ad:f8 ) than the beacon BSSID in the previous step.
The OUI lookup result shows this OUI belongs to \"AzureWave Technology Inc.\", which matches its hardware spec . After the Chromecast connects to the AP, the SSID field in the beacon frame was set to NULL. Given the OUI difference and the fact that the Chromecast simultaneously broadcasts beacons and associates with the AP, I suspect it actually contains two Wifi chips inside, which does not make much sense given its small form factor and low price. Or maybe the AzureWave chip supports both modes simultaneously?","tags":"network","title":"Chromecast Wireless Protocols Part-I: Setup"},{"url":"http://jhshi.me/2016/10/18/calculating-crc-for-ht-sig-in-80211n-preamble/index.html","text":"The HT-SIG field of the 802.11n PLCP preamble contains an 8-bit CRC for the receiver to validate the sanity of the header. Here is how to calculate it. HT-SIG contains 48 bits and spans 2 OFDM symbols (BPSK modulated, 1/2 rate). This diagram from the 802.11-2012 standard describes the logic to calculate the CRC for the first 34 bits of the field. Here is a Python version of the implementation. def calc_crc ( bits ): c = [ 1 ] * 8 for b in bits : next_c = [ 0 ] * 8 next_c [ 0 ] = b ^ c [ 7 ] next_c [ 1 ] = b ^ c [ 7 ] ^ c [ 0 ] next_c [ 2 ] = b ^ c [ 7 ] ^ c [ 1 ] next_c [ 3 ] = c [ 2 ] next_c [ 4 ] = c [ 3 ] next_c [ 5 ] = c [ 4 ] next_c [ 6 ] = c [ 5 ] next_c [ 7 ] = c [ 6 ] c = next_c return [ 1 - b for b in c [:: - 1 ]] c is the 8-bit shift register. For each incoming bit, we calculate the next value of each bit in the register and store them in next_c . Finally, we perform two operations: reverse and one's complement (invert each bit). Note that c[7] is the first output bit. The standard also provides a test case. >>> s = '1 1 1 1 0 0 0 1 0 0 1 0 0 1 1 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0' >>> bits = [ int ( b ) for b in s . split ()] >>> calc_crc ( bits ) [ 1 , 0 , 1 , 0 , 1 , 0 , 0 , 0 ] Translating the logic into HDLs such as Verilog is quite straightforward.
module ht_sig_crc ( input clock , input enable , input reset , input bit , input input_strobe , output [ 7 : 0 ] crc ); reg [ 7 : 0 ] C ; genvar i ; generate for ( i = 0 ; i < 8 ; i = i + 1 ) begin : reverse assign crc [ i ] = ~ C [ 7 - i ]; end endgenerate always @( posedge clock ) begin if ( reset ) begin C <= 8'hff ; end else if ( enable ) begin if ( input_strobe ) begin C [ 0 ] <= bit ^ C [ 7 ]; C [ 1 ] <= bit ^ C [ 7 ] ^ C [ 0 ]; C [ 2 ] <= bit ^ C [ 7 ] ^ C [ 1 ]; C [ 7 : 3 ] <= C [ 6 : 2 ]; end end end endmodule Here we use the generate block to do the bit-reverse and negation.","tags":"network","title":"Calculating CRC for HT-SIG in 802.11n Preamble"},{"url":"http://jhshi.me/2016/10/04/pythonh-error-while-installing-numpy-for-pypy-on-ubuntu-1604/index.html","text":"I encountered this error while installing the PyPy port of Numpy on Ubuntu 16.04. Here is how to solve it. The command line I used was: $ pip install git+https://bitbucket.org/pypy/numpy.git The error message was like this: # a bunch of tracebacks # then this line SystemError: Cannot compile 'Python.h'. Perhaps you need to install python-dev|python-devel. The solution is actually to install the pypy-dev package instead of the python-dev package suggested by the error message. $ sudo apt-get install pypy-dev $ pip install git+https://bitbucket.org/pypy/numpy.git Thanks to http://askubuntu.com/a/612016/219398 .","tags":"errors","title":"Python.h Error While Installing Numpy for PyPy on Ubuntu 16.04"},{"url":"http://jhshi.me/2016/10/04/python-testing-using-pytest-tox-travis-ci-and-coverall/index.html","text":"This post explains the automatic testing setup for the WlTrace project. You can see a live demo for all tools described in this post at the WlTrace Github repo . It may be a bit confusing at first with all these tools, which serve subtly different purposes.
Next, I'll first explain the purpose of each tool, and then show the particular setup in the WlTrace project. pytest: Micro Testing The Code pytest is a unit test framework that actually tests the nuts and bolts of each piece of your code. You'll need to actually write test cases and pytest will collect and run those tests for you. I favor pytest over Python's default unittest framework because of its simplicity. Furthermore, there are many plugins that check various aspects of the project, such as doctest integration , pep8 format checking , etc. pytest has excellent documentation . Here is the particular configuration for the WlTrace project. # setup.cfg [tool:pytest] addopts = -v -x --doctest-modules --ignore=setup.py --cov=wltrace --pep8 pep8maxlinelength = 80 pep8ignore = E402 # module level import not at top of file E241 # multiple spaces after ',' E226 # missing white space around arithmetic operator E222 # multiple spaces after operator docs/source/conf.py ALL The addopts option specifies the arguments to call pytest with: -v : verbose output. This shows what is being tested, instead of just a dot. -x : stop on the first failed test --doctest-modules : integrate doctest --ignore=setup.py : do not do doctest on the top level setup.py --cov=wltrace : Test the coverage of the wltrace module, which is the top module of the WlTrace package. Requires the pytest-cov plugin and the Coverage.py package. --pep8 : check pep8 format compliance. Requires the pytest-pep8 plugin. Now we can use a single pytest command to fire up the tests and also get code coverage information during the test, stored in the .coverage file, which will be used later for Coveralls. This file should be ignored by your VCS. Tox: Macro Testing The Environment Tox is a tool to test your final package in various Python environments, such as Python 2, Python 3, PyPy, Jython, etc.
It is mainly used to make sure the setup.py is configured properly and your final distribution package can be successfully installed under those Python environments. Furthermore, it issues the test command after installation to test if the project also runs properly in those environments. Here's the project's tox.ini : [tox] envlist = py27,pypy [testenv] commands = pytest deps = pytest-pep8 pytest-cov pytest coverage [testenv:pypy] install_command = pip install git+https://bitbucket.org/pypy/numpy.git {packages} Here we specified two Python environments: Python 2.7 and PyPy. Note the special install_command configuration for PyPy to deal with the Numpy package. Travis CI: Continuous Integration Travis CI is an online service to automatically build and test your project for continuous integration, so it is easy to pinpoint the single commit that breaks the build. Here is the .travis.yml configuration. language : python python : - \"2.7\" - \"pypy\" install : - | if [[ \"${TRAVIS_PYTHON_VERSION}\" = pypy ]]; then git clone https://github.com/yyuu/pyenv.git ~/.pyenv PYENV_ROOT=\"$HOME/.pyenv\" PATH=\"$PYENV_ROOT/bin:$PATH\" eval \"$(pyenv init -)\" pyenv install pypy-5.4.1 pyenv global pypy-5.4.1 pip install git+https://bitbucket.org/pypy/numpy.git fi - pip install tox-travis coveralls script : - tox after_success : - coveralls Here we first install a recent version of PyPy and also tox-travis to make tox play nice with Travis CI. We also install the coveralls tool to be used later to publish the coverage information to http://coveralls.io . Note that we install numpy in the pypy environment explicitly. This is because the tox-travis plugin has some difficulty utilizing the install_command configuration. Coverage.py: Code Coverage Coverage.py is a tool to measure the code coverage of Python programs.
In the context of testing, it can be used to measure the code coverage when running pytest, which effectively translates code coverage to test coverage. Previously we used the pytest-cov plugin to call this tool during testing. The coverage information is stored in the .coverage file. Coveralls: Showcase Coverage Coveralls is a web service that displays the code coverage information generated by the Coverage.py tool. We installed the coveralls package in Travis. The coveralls command will publish the coverage information to http://coveralls.io .","tags":"python","title":"Python Testing Using pytest, Tox, Travis-CI and Coveralls"},{"url":"http://jhshi.me/2016/10/03/tox-and-travis-setup-for-pypy-project/index.html","text":"Recently I was developing a Python project that supports both the regular Python 2.7 and also the PyPy environment. Here is how to set up the automatic testing environment using Tox and Travis-CI. Tox Setup and Handling Numpy Tox is a tool to test if your package can be installed in various Python environments properly. Tox setup is relatively easy: just follow the basic setup document of Tox. The tricky part of supporting regular Python and PyPy simultaneously is to handle the numpy package properly, since you cannot simply do a pip install numpy in PyPy, but have to use the BitBucket repo URL instead. [UPDATE 2016-10-05] I found there is this PyPy port of numpy called numpy-pypy . We can also use this package. To install numpy properly, I did three things. First, have one common-requirements.txt that specifies all requirements except for numpy , and then have two separate py27-requirements.txt and pypy-requirements.txt . In py27-requirements.txt , specify the numpy version needed. # requirements for py27 -r common-requirements.txt numpy == 1.11.1 While in pypy-requirements.txt , use the BitBucket link instead:
# requirements for pypy git+https://bitbucket.org/pypy/numpy.git -r common-requirements.txt Second, detect the Python environment at runtime and set up install_requires properly. install_requires = [ # requirements ] if platform . python_implementation () == 'PyPy' : install_requires . append ( 'numpy-pypy' ) else : install_requires . append ( 'numpy' ) setup ( # ... install_requires = install_requires , # ... ) At this point, you should be able to use the tox command to test both the py27 and the pypy environment. Travis Setup We need to first install a recent version of PyPy, as Travis is known to be behind on PyPy versions. Here is the .travis.yml that installs PyPy v5.4.1 using pyenv . language : python python : - \"2.7\" - \"pypy\" addons : apt : packages : - pypy-dev - liblapack-dev install : - | if [[ \"${TRAVIS_PYTHON_VERSION}\" = pypy ]]; then git clone https://github.com/yyuu/pyenv.git ~/.pyenv PYENV_ROOT=\"$HOME/.pyenv\" PATH=\"$PYENV_ROOT/bin:$PATH\" eval \"$(pyenv init -)\" pyenv install pypy-5.4.1 pyenv global pypy-5.4.1 fi - pip install tox-travis script : - tox The numpy-pypy package depends on pypy-dev and liblapack-dev , so we install them through the addons configuration of Travis. Note that we also used the tox-travis plugin to ease the integration of tox and Travis. Resources https://github.com/pyca/pyopenssl/blob/master/.travis.yml http://stackoverflow.com/questions/20617600/travis-special-requirements-for-each-python-version","tags":"python","title":"Tox and Travis Setup for Python Project Using PyPy"},{"url":"http://jhshi.me/2016/09/28/automated-testing-of-pelican-blog-using-travis-ci/index.html","text":"Recently I adjusted my blog workflow a bit to utilize Travis CI to automatically test and publish the site. Here is my setup process. Background I use Pelican to maintain this blog website.
The source of the site content (ReStructured text, not html) is hosted at this GitHub repository: https://github.com/jhshi/blog_source . The generated site is hosted here: https://github.com/jhshi/jhshi.github.com . My previous workflow was: Write content locally Do a bit of testing using a local HTTP server (mostly looking for format glitches) Commit and push to the jhshi/blog_source repo. Do a fab github to generate the site and push to the jhshi/jhshi.github.com repo. As you can see, the process (especially steps 2 - 4) is a bit tedious, and the site is not thoroughly tested. In particular, I do not always run linkchecker before I push. And finally, I need to manually publish the site every time. So the goal is to shift some of the heavy lifting, especially testing and publishing, to a continuous integration tool. Travis seems the natural choice given its excellent integration with Github. Before you get started, I suppose you have already connected Travis with your Github account. Refer to the documentation from Travis for more details: https://docs.travis-ci.com/user/getting-started/ . Build Environment As a first step, we tell Travis how to build our project. Put this content in a file named \".travis.yml\" in your project's root directory. language : python install : NO_SUDO=1 source setup.sh script : make html There are two things worth noticing: The setup.sh script, which can be found here: https://github.com/jhshi/blog_source/blob/master/setup.sh , is responsible for setting up the pelican environment, including cloning the proper plugin and theme repositories. The NO_SUDO flag tells the script to not use sudo and also to use https URLs for repositories instead of ssh . If you use the ga_page_view plugin, the build will fail since the private key file ga.pem will not exist in the freshly cloned repository: it shouldn't. I'll talk about how to deal with this later. But for now, we can tell pelican to silently ignore this error.
Change the ga_page_view configuration in pelicanconf.py slightly like this: if os . path . isfile ( 'ga.pem' ): GOOGLE_SERVICE_ACCOUNT = 'xxx' GOOGLE_KEY_FILE = os . path . join ( PROJECT_ROOT , 'ga.pem' ) GA_START_DATE = '2005-01-01' GA_END_DATE = 'today' GA_METRIC = 'ga:pageviews' else : print \"[WARN] No key found for Google Analytics\" Commit and push to the master branch of your source repository, and check travis logs to make sure it can successfully build the site. Private Files There are two private files in my case: the key for the Google Analytics API and the deploy key for the website repository. Let's first create the deploy key if you don't have it already. Note that this must NOT be your primary ssh key. So I suggest you create a new pair of SSH keys just for the website repo. $ ssh-keygen -f ./deploy_key Copy the content of deploy_key.pub to your project's deploy keys settings, then move it to somewhere else or delete it: we don't really need it anymore. Next, install the travis CLI tool, which is used to encrypt the private files. travis depends on ruby 2.0 or greater. So I recommend using the Brightbox ppa for ruby. $ sudo add-apt-repository ppa:brightbox/ruby-ng $ sudo apt-get update $ sudo apt-get install ruby2.2-dev ruby-2.2 $ sudo gem install travis $ travis -v 1.8.2 # <--- your mileage may vary $ travis login # <--- enter your GITHUB username and password Next, use travis to encrypt the files. Note that you can only encrypt one file in total. Here we have two files to encrypt: the key for the Google API and the key for deploying the website. So we need to tar them first: $ tar cvzf secrets.tgz ga.pem deploy_key $ travis encrypt-file secrets.tgz Finally, unpack those secrets by adding these lines to the .travis.yml .
before_install : - openssl aes-256-cbc -K $encrypted_XXXXXX_key -iv $encrypted_XXXXXX_iv -in secrets.tgz.enc -out secrets.tgz -d - tar xvf secrets.tgz Replace XXXXXX with the magic number you got from travis encrypt-file . Now, delete secrets.tgz , add deploy_key and ga.pem to your .gitignore , commit all changes, and push to the master branch. Check travis logs to make sure the site gets built. In particular, you should not see the No key warning for the ga_page_view plugin. Automatic Deployment Finally, let's tell travis to deploy the website if the test passes. Add this rule to your Makefile : check : clean publish cd $( OUTPUTDIR ) && $( PY ) -m pelican.server && cd - & sleep 3 linkchecker http://localhost:8000 pgrep -f \"^python -m pelican.server\" | xargs kill -9 travis : clean check publish chmod 600 deploy_key && \\ eval ` ssh-agent -s ` && \\ ssh-add deploy_key && \\ cd $( OUTPUTDIR ) && \\ git init . && \\ git config user.email \"robot@travis-ci.org\" && \\ git config user.name \"Travis\" && \\ git remote add origin $( GITHUB_PAGES_REPO ) && \\ git add --all --force . && \\ git commit -am \"Site updated on `date -R`\" && \\ git push origin master --force && \\ cd - Note that we also check URL links using linkchecker before deploying. Change the make target in .travis.yml to make travis , commit and push. And now if the site is good, it should be automatically deployed.","tags":"pelican","title":"Automated Testing of Pelican Blog Using Travis-CI"},{"url":"http://jhshi.me/2016/08/24/start-autossh-on-boot/index.html","text":"AutoSSH is a great tool to maintain a persistent SSH tunnel. Here is how to start AutoSSH on boot so that the tunnel can survive a system reboot of the local machine. Here we assume you can already ssh into the remote machine without typing the password. If not, see my previous post on how to set it up.
First, on your local machine, switch to the root user: $ sudo su Second, ssh into the remote machine as root so the remote machine is added to root 's known_hosts . # ssh <user>@<remote_host> Third, add this line to your /etc/rc.local . autossh -N -f -i /home/<user>/.ssh/id_rsa -R 22222:localhost:22 <user>@<remote_host> The command arguments are: -N : tell ssh to not execute any command, since we only use it for tunneling. -f : tell autossh to fall into the background on start. -i : tell ssh to use the proper identity. -R 22222:localhost:22 : reverse tunnel the remote host's 22222 port to localhost's 22 port, so that we can use ssh -p 22222 localhost on the remote host to ssh into the local machine.","tags":"linux","title":"Start AutoSSH on Boot"},{"url":"http://jhshi.me/2016/08/14/disable-new-sign-in-from-email-from-google/index.html","text":"I have a Gmail account for Android ROM testing purposes. And I kept receiving this annoying email from Google anytime I used that account to sign in to Gmail (most of the time it's a cleanly flashed Android phone which I will wipe out later). Here is how to disable the email notification. Log in to Gmail using the account for which you wish to stop receiving notifications. At the very bottom right of the Inbox page, you will see this: Last account activity: 0 minutes ago Details Click on \"Details\", then at the bottom of the pop up window: Alert preference: Show an alert for unusual activity. change Click \"change\" and then choose \"Never show an alert for unusual activity\". Then \"Apply\", then \"Disable alerts\". Thanks to: https://webapps.stackexchange.com/questions/85727/make-google-stop-sending-new-sign-in-from-emails/86432#86432","tags":"misc","title":"Disable \"New sign-in from\" Email from Google"},{"url":"http://jhshi.me/2016/08/10/start-gerrit-server-upon-boot-on-ubuntu/index.html","text":"I couldn't find any solid documentation online.
Here are the steps to configure the Gerrit server to automatically start upon system boot on Ubuntu. Here are the version numbers of my setup; your mileage may vary. Ubuntu $ lsb_release -a No LSB modules are available. Distributor ID: Ubuntu Description: Ubuntu 14.04.1 LTS Release: 14.04 Codename: trusty $ uname -a Linux platform 3.13.0-39-generic #66-Ubuntu SMP Tue Oct 28 13:30:27 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Gerrit 2.9.1, site directory is /srv/gerrit2 . First, modify the gerrit.sh script, change: # Required-Start: $named $remote $syslog To this line: # Required-Start: $all Note the line is still commented. Second, make the proper symlinks: $ sudo ln -sfv /srv/gerrit2/bin/gerrit.sh /etc/init.d/gerrit $ sudo update-rc.d gerrit defaults 92 Finally, we need to tell Gerrit the site directory. $ echo \"GERRIT_SITE=/srv/gerrit2\" | sudo tee /etc/default/gerritcodereview You should be good to go. Restart the server and Gerrit should be up and running as well. Thanks to this post: http://askubuntu.com/questions/721478/ubuntu-init-d-configuration-not-starting-gerrit-2-11-4-at-boot","tags":"linux","title":"Start Gerrit Server upon Boot on Ubuntu"},{"url":"http://jhshi.me/2016/08/06/filenotfound-exception-in-recoverysysteminstallpackage/index.html","text":"I encountered this error while trying to use the RecoverySystem.installPackage API to apply an OTA update. This post shows what causes the error and how to work around it. The error occurs for AOSP release android-6.0.1_r24 in my case. The Cause Looking at the source code at $AOSP/frameworks/base/core/java/android/os/RecoverySystem.java (Link RecoverySystem.java ), the exception was thrown from this line: public static void installPackage ( Context context , File packageFile ) throws IOException { String filename = packageFile . getCanonicalPath (); FileWriter uncryptFile = new FileWriter ( UNCRYPT_FILE ); // <-- THIS LINE try { uncryptFile .
write ( filename + \"\\n\" ); } finally { uncryptFile . close (); } Log . w ( TAG , \"!!! REBOOTING TO INSTALL \" + filename + \" !!!\" ); // If the package is on the /data partition, write the block map file // into COMMAND_FILE instead. if ( filename . startsWith ( \"/data/\" )) { filename = \"@/cache/recovery/block.map\" ; } final String filenameArg = \"--update_package=\" + filename ; final String localeArg = \"--locale=\" + Locale . getDefault (). toString (); bootCommand ( context , filenameArg , localeArg ); } Earlier, UNCRYPT_FILE was defined as this: private static File RECOVERY_DIR = new File ( \"/cache/recovery\" ); private static File COMMAND_FILE = new File ( RECOVERY_DIR , \"command\" ); private static File UNCRYPT_FILE = new File ( RECOVERY_DIR , \"uncrypt_file\" ); So its value is /cache/recovery/uncrypt_file . However, at this point, its parent directory /cache/recovery/ does not exist yet! And FileWriter will not create it automatically, hence the FileNotFound exception. The Fix If you are an app developer and do not have access to the AOSP framework on your target system, simply create that folder before calling installPackage . ( new File ( \"/cache/recovery/\" )). mkdirs (); RecoverySystem . installPackage ( context , otaPackage ); The real fix from the platform side is this: public static void installPackage ( Context context , File packageFile ) throws IOException { String filename = packageFile . getCanonicalPath (); RECOVERY_DIR . mkdirs (); FileWriter uncryptFile = new FileWriter ( UNCRYPT_FILE ); // <-- THIS LINE //... }","tags":"errors","title":"FileNotFound Exception in RecoverySystem.installPackage"},{"url":"http://jhshi.me/2016/08/06/how-to-properly-mirror-cyanogenmod/index.html","text":"Recently I needed to create a mirror of CyanogenMod to facilitate further development of our smartphone testbed PhoneLab.
The goal is to have a working mirror so that we can stage any experimental changes on our own server, since there is no plan to publish such changes upstream (at least for now). Quite surprisingly, I found this to be a non-trivial task. Here is a log of how I walked around the minefield. Background and Goal We are using Gerrit as a Git server. It's a decent Git hosting solution that has some nice access control features. We have built a set of tools that can automatically merge a given set of experimental branches to our baseline branch, and generate incremental OTA updates that we can push to our participants. Starting from summer 2016, we are using the Nexus 6 device (code name shamu ). We have been using the stock AOSP mirrors since the very beginning. But this year we decided to give CyanogenMod a try (particularly because it is a huge pain to even get a working ROM for Nexus 6 using stock AOSP). But we still want the automated process of merging experimental changes and doing OTA updates. The goals of our mirror are: A simple repo init/repo sync using our manifest should give you a working code-base, meaning you can build a working ROM for Nexus 6 with it. No special tweaks needed on the experimenter's side. Each repo should have a common baseline branch (called phonelab/cm-13.0/develop ) that somebody can fork from and start making experimental changes. The Overall Picture Produce a local clone that is suitable to serve as a mirror Push this local clone to our Gerrit server Compose a proper repo manifest to point things to the right place. Get a Working Mirror Repo We have chosen the latest stable CyanogenMod release for Nexus 6 ( stable/cm-13.0-ZNH2K ). The first trap: the default manifest does not work if you want to create a mirror. In particular, CM has used shallow clones ( clone-depth=\"1\" ) for certain repos.
This is OK if you do not intend to ever push the repo, but most likely Gerrit will complain about this and eventually claim that these repos are corrupted because the history is not complete. So the first step is to fork the CyanogenMod manifest (mine is here: https://github.com/jhshi/android , check the stable/cm-13.0-ZNH2K branch) and remove any shallow clones. This can be done via this VIM command: :%s/clone-depth=\"1\" //g Also, since we are using a different manifest repo, we also set the default fetch URL to be an absolute path: < remote name = \"github\" fetch = \"https://github.com/\" review = \"review.cyanogenmod.org\" / > Now, do a repo init using this manifest. $ repo init -u https://github.com/jhshi/android -b stable/cm-13.0-ZNH2K $ repo sync This will download all repositories properly. After this finishes, we need to also grab the repos for our specific device. $ source build/envsetup.sh $ breakfast shamu This will grab two extra repos: device/moto/shamu kernel/moto/shamu The next step is to create a common baseline branch based on the current tip. $ repo forall -ec 'echo $REPO_PATH; git checkout -b phonelab/cm-13.0/develop' Then, we create the corresponding repositories on the Gerrit server. Here is the second trap.
In the CM manifest, there are several projects like these: <project path= \"hardware/qcom/audio-caf/apq8084\" name= \"CyanogenMod/android_hardware_qcom_audio\" groups= \"qcom,qcom_audio\" revision= \"stable/cm-13.0-caf-8084-ZNH2K\" /> <project path= \"hardware/qcom/audio-caf/msm8916\" name= \"CyanogenMod/android_hardware_qcom_audio\" groups= \"qcom,qcom_audio\" revision= \"stable/cm-13.0-caf-8916-ZNH2K\" /> <project path= \"hardware/qcom/audio-caf/msm8937\" name= \"CyanogenMod/android_hardware_qcom_audio\" groups= \"qcom,qcom_audio\" revision= \"stable/cm-13.0-caf-8937-ZNH2K\" /> <project path= \"hardware/qcom/audio-caf/msm8952\" name= \"CyanogenMod/android_hardware_qcom_audio\" groups= \"qcom,qcom_audio\" revision= \"stable/cm-13.0-caf-8952-ZNH2K\" /> <project path= \"hardware/qcom/audio-caf/msm8960\" name= \"CyanogenMod/android_hardware_qcom_audio\" groups= \"qcom,qcom_audio\" revision= \"stable/cm-13.0-caf-8960-ZNH2K\" /> <project path= \"hardware/qcom/audio-caf/msm8974\" name= \"CyanogenMod/android_hardware_qcom_audio\" groups= \"qcom,qcom_audio\" revision= \"stable/cm-13.0-caf-8974-ZNH2K\" /> <project path= \"hardware/qcom/audio-caf/msm8994\" name= \"CyanogenMod/android_hardware_qcom_audio\" groups= \"qcom,qcom_audio\" revision= \"stable/cm-13.0-caf-8994-ZNH2K\" /> <project path= \"hardware/qcom/audio-caf/msm8996\" name= \"CyanogenMod/android_hardware_qcom_audio\" groups= \"qcom,qcom_audio\" revision= \"stable/cm-13.0-caf-8996-ZNH2K\" /> As you can see, they are in fact from the same remote repository, just with different revision names, and they are expected in different folders. Since we want a common branch name for each repository, we have to create multiple repositories on our server, so that we can let the same name phonelab/cm-13.0/develop point to different commits. The key point here is: name is no longer a unique key to identify a project, but path is. So we will name the repos by their path , not by name .
$ repo forall -ec 'echo $REPO_PATH && ssh -p 29418 user@server gerrit create-project cm-shamu/$REPO_PATH' Note that I am using $REPO_PATH , which is the local filesystem folder name, rather than $REPO_NAME . Also, all such repos are under the cm-shamu/ namespace on our server. Next, upload all those repos: $ repo forall -ec 'echo $REPO_PATH && git push ssh://user@server:29418/cm-shamu/$REPO_PATH refs/heads/*' This shall take a while. Get a Working Manifest Now that all repos are on our Gerrit server, we need to compose a proper manifest for repo init . Start with the default manifest at https://github.com/CyanogenMod/android . We made these changes: There should be only one remote called phonelab , which points to our Gerrit server. The default revision of every project should be phonelab/cm-13.0/develop . Remove any individual revision or remote project attributes. This can be done by these VIM commands: :%s/revision=\".\\{-}\" //g and :%s/remote=\".\\{-}\" //g , where .\\{-} is VIM's non-greedy regex syntax. Remove all name attributes, since the name will be the path: :%s/name=\".\\{-}\" //g . Change path=.. to be name=.. : :%s/path=/name=/g . Change the default fetch URL to be . since the manifest and all other repos are now at the same level. I ended up with this manifest: https://github.com/jhshi/cm.manifest.shamu/blob/phonelab/cm-13.0/develop/default.xml Create a Gerrit project with the path cm-shamu/manifest.git and push the modified manifest to it. At this point, anybody should be able to do a single repo init using the above manifest, and all repos will be pulled from our Gerrit server.","tags":"android","title":"How to Properly Mirror CyanogenMod"},{"url":"http://jhshi.me/2016/08/01/xilinx-ise-internal_errorxstcmainc3483156161/index.html","text":"I encountered this error a lot recently when trying to compile a customized Verilog project for USRP N210 using Xilinx ISE 12.2. Here is one reason why this error might happen, from my experience.
It seems ISE does not like it when you use indexed array items in module instance ports. For example: reg [ 31 : 0 ] ram [ 0 : 127 ]; reg [ 6 : 0 ] addr ; some_module m_inst ( . clock ( clock ), . data_in ( ram [ addr ]), // other ports ); This will most probably cause the INTERNAL_ERROR . The workaround is to use a dedicated wire for the port instead of an indexed array item. reg [ 31 : 0 ] ram [ 0 : 127 ]; reg [ 6 : 0 ] addr ; wire [ 31 : 0 ] the_input = ram [ addr ]; some_module m_inst ( . clock ( clock ), . data_in ( the_input ), // other ports ); Apparently this is just one of the possible reasons that could cause the error, but definitely worth checking out if you're desperate after exhausting all other possibilities.","tags":"errors","title":"Xilinx ISE INTERNAL_ERROR:Xst:cmain.c:3483:1.56.16.1"},{"url":"http://jhshi.me/2016/07/12/setting-up-usrpn2x0-in-virtualbox/index.html","text":"This post shows how to connect to a USRP N2x0 from a Ubuntu guest OS inside Virtualbox running on a Windows host. Host OS Here we assume you have a secondary Ethernet card that is physically connected to the USRP N2x0. First, in VirtualBox's configuration window, click the Network tab, and then Adapter 2 . Leave Adapter 1 alone so that you still have Internet access inside the VM. In Attached to , choose Bridged Adapter so that the VM has direct access to the physical network adapter. Then in Name , choose the secondary NIC that is physically connected to the USRP. Save the configuration and boot into the guest OS. Guest OS Inside the guest Ubuntu OS, make sure the network adapter is visible. $ ifconfig -a You should see two Ethernet interfaces: one for Adapter 1 which provides Internet access through NAT, and another for the Adapter 2 we just added. At this point, there should be no IP address assigned to the second interface. In my case, the two interfaces are enp0s3 (NAT) and enp0s8 (Bridged).
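If you are not sure which guest interface maps to which VirtualBox adapter, one quick check (a sketch using standard sysfs paths; your interface names will differ) is to list each interface together with its MAC address and compare against the MACs shown in the adapter settings:

```shell
# Print every network interface the guest kernel knows about,
# together with its MAC address, straight from sysfs.
for iface in /sys/class/net/*; do
    printf '%s\t%s\n' "$(basename "$iface")" "$(cat "$iface/address")"
done
```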
Adapt the names accordingly for your setup in the following instructions. Next, assign a static IP address to enp0s8 . The default IP address for USRPs is usually 192.168.10.2, so we set the IP address for enp0s8 to be in the same subnet. $ sudo ifconfig enp0s8 192.168.10.1 Then, add a static route so that all packets in the 192.168.10.0/24 subnet are routed via the enp0s8 interface. $ sudo ip route add 192.168.10.0/24 dev enp0s8 Make sure that the newly added route is recognized by the kernel (the last line of the following output). $ ip route default via 10.0.2.2 dev enp0s3 proto static metric 100 10.0.2.0/24 dev enp0s3 proto kernel scope link src 10.0.2.15 metric 100 192.168.10.0/24 dev enp0s8 proto kernel scope link src 192.168.10.1 Finally, bring up the interface and ping the USRP! $ sudo ifconfig enp0s8 up $ ping 192.168.10.2 $ uhd_find_devices You should receive ping responses and uhd_find_devices should be able to find the USRP devices.","tags":"research","title":"Setting Up USRPN2x0 in VirtualBox"},{"url":"http://jhshi.me/2016/07/12/mounting-virtualbox-shared-folder-on-boot/index.html","text":"Shared folders are a great feature of Virtualbox to share data between the host and guest OS. This post shows how to mount a shared folder during boot on a Ubuntu 16.04 guest OS running on a Windows 10 host OS. Setup the Shared Folder Suppose you want to share D:\\vbox_share on the host OS with the guest OS. First, open up the virtual machine configuration window in Virtualbox, and in the Shared Folders tab, click the add button, shown as follows. In the pop up window, enter D:\\vbox_share in the Folder Path box -- this is the path to the folder on the host OS. In the Folder Name box, give an alias to that folder which we will use later in the guest OS. Here I use vbox_share as an example. Optionally, check the Auto-mount and Make Permanent boxes as you wish. Then, boot into your Ubuntu virtual machine, open a terminal, and use this command to mount the shared folder.
Here I assume the mount point is /mnt/vbox_share . $ sudo mount -t vboxsf vbox_share /mnt/vbox_share Mount on Boot To mount the shared folder during boot, we need to do two things. First, add this entry to /etc/fstab vbox_share /mnt/vbox_share vboxsf defaults 0 0 Second, add this entry to /etc/modules . vboxsf This is because the vboxsf module will not be loaded by Linux by default during boot. Therefore the mounting will fail. The entry in /etc/modules tells Linux to load the module first before trying to mount the shared folder. Thanks to this post: http://askubuntu.com/questions/365346/virtualbox-shared-folder-mount-from-fstab-fails-works-once-bootup-is-complete","tags":"linux","title":"Mounting Virtualbox Shared Folder on Boot"},{"url":"http://jhshi.me/2016/07/08/customize-usrp-n2x0-dsp-rx-chain/index.html","text":"In one of my research projects, I need to develop some signal processing logic that runs on the FPGA of the USRP, to meet some delay and timing requirements. Here is a general overview of the steps to customize the DSP Rx chain of USRP N2x0 devices. Details of my particular customization will probably be discussed in separate posts. DSP Rx Chain Overview The spirit of Software Defined Radios is to push as much of the signal processing as possible to the host PC, enabling easy development and fast prototyping. However, there are certain tasks that are just too timing/performance sensitive to be put on the host PC. Therefore, USRPs have an FPGA on board to perform several pre-processing steps on the signal samples before they are sent to the host PC. The above diagram shows the pipeline of the DSP receive chain on the FPGA. The RF signals are sampled and converted by the ADC module, and the raw samples are processed by the Rx Frontend module for scaling and converting the samples to the familiar I/Q values. Next, the I/Q values ( frontend_i, frontend_q ) are sent to a dummy custom module.
By default, this Custom module just pipes the I/Q samples directly to the Digital Down Converter (DDC) module to extract the baseband signal. Then the baseband signals ( ddc_out_sample ) are passed into the Custom module again, which in turn passes them directly to the VITA49 core, which will frame these samples and send them to the host PC. As we can clearly see, the USRP FPGA framework already provides us a nice custom valve where we can perform custom processing either before or after the DDC stage. Enable Custom Build By default, the Custom module is disabled. To enable it, we need to make the following changes. Here we use the N200R4 as an example. The steps are the same for other N2x0 devices. First, clone the USRP FPGA source (if you haven't done so). $ git clone git@github.com:EttusResearch/fpga.git $ cd fpga/usrp2/top/N2x0/ Next, make a copy of the original Makefile. $ cp Makefile.N200R4 Makefile.N200R4.custom Then make these changes to Makefile.N200R4.custom . # use a different build directory - BUILD_DIR = $(abspath build$(ISE)-N200R4) + BUILD_DIR = $(abspath build$(ISE)-N200R4-custom) # remove these two lines, as we will set them later - CUSTOM_SRCS = - CUSTOM_DEFS = # include a custom src list file that we will create later, this file sets # the CUSTOM_SRCS variable + include ../../custom/Makefile.srcs # Enable the custom module using verilog macro -\"Verilog Macros\" \"LVDS=1|RX_DSP0_MODULE=custom_dsp_rx\" +\"Verilog Macros\" \"LVDS=1|RX_DSP0_MODULE=custom_dsp_rx|RX_DSP1_MODULE=custom_dsp_rx\" Then, in the fpga/custom/ directory, create a file named Makefile.srcs . CUSTOM_SRCS = $( abspath $( addprefix $( BASE_DIR ) /../custom/, \\ custom_dsp_rx.v \\ )) As you continue the development, you'll probably create more Verilog modules. Just add their file names here. Checkpoint Now the build system will include the custom_dsp_rx.v file. Before you make any changes to that file, I suggest you compile the whole project for a sanity check.
$ cd fpga/usrp2/top/N2x0/ $ make -f Makefile.N200R4.custom clean bin $ uhd_image_loader --args = \"type=usrp2\" --fpga-path build-custom/u2plus.bin --fw-path path/to/your/firmware This should succeed, and the functionality of the FPGA image should be exactly the same as before since the default Custom module only passes through signals. Note: to flash the FPGA image, you'll also need a compatible firmware image. So I recommend cloning the uhd and fpga repos and building them together.","tags":"research","title":"Customize USRP N2x0 DSP RX Chain"},{"url":"http://jhshi.me/2016/07/08/installing-tmux-from-source-non-root/index.html","text":"Recently I needed to install tmux on a server which runs some ancient RHEL and which I do not have sudo access to. Here is how I did it. In fact it has tmux 1.6 pre-installed, but my tmux configuration file is based on tmux 2.2, which contains many options that are absent in earlier versions of tmux. Libevent Setup libevent will most likely be missing (as it is in my case). So first, let's set it up. Download the tarball from http://libevent.org/ , extract it, configure and install. $ wget https://github.com/libevent/libevent/releases/download/release-2.0.22-stable/libevent-2.0.22-stable.tar.gz $ tar xvf libevent-2.0.22-stable.tar.gz $ cd libevent-2.0.22-stable $ ./configure --prefix = $HOME $ make # use make -j 8 to speed it up if your machine is capable $ make install Note that since I do not have root access to this machine, I set the installation prefix to be my home directory.
$ wget https://github.com/tmux/tmux/releases/download/2.2/tmux-2.2.tar.gz $ tar xvf tmux-2.2.tar.gz $ cd tmux-2.2 $ ./configure --prefix = $HOME CFLAGS = \"-I $HOME /include\" LDFLAGS = \"-L $HOME /lib\" $ make $ make install Again, I set the installation prefix to be my home directory, and also tell gcc where to find the libevent headers and libraries using CFLAGS and LDFLAGS . $PATH Setup After this, the tmux binary will be installed in $HOME/bin . Finally, we need to tweak the $PATH variable a bit so that bash will find this binary before the system one. $ export PATH = $HOME /bin: $PATH You may want to also put the above line in your .bashrc . Now you should be able to use the shiny new tmux. $ tmux -V tmux 2.2","tags":"linux","title":"Installing Tmux from Source (Non-Root)"},{"url":"http://jhshi.me/2016/03/09/zathura-pdf-viewer-for-vim-lovers/index.html","text":"I have been looking for a PDF viewer on the Linux platform that is lightweight and keyboard driven. Evince was once my favorite, until I met Zathura . Highlights Here are a few features of Zathura that I really appreciate: Keyboard Driven : the keyboard shortcuts are very similar to Vim's. A Vim user will immediately feel at home when using Zathura. Minimal Design : but it has almost all features you would expect from any decent PDF viewer. In particular, it automatically reloads the file if changes are detected. This comes in handy together with the continuous preview mode of Latexmk . Customizability : similar to Vim, there is a zathurarc which you can use to customize Zathura. Installation Use this command to install Zathura and set it as the default PDF viewer. $ sudo apt-get install zathura $ mimeopen -d *.pdf # choose Zathura Basic Usage As I mentioned before, Zathura uses almost the same key mappings as Vim. For example, j, k, h, l for navigation, gg, G to go to the first or last page, and J, K for next and previous page. These are pretty much all you need for basic PDF viewing. In particular, TAB will show the table of contents.
For more keyboard shortcuts, check out the manual. Configuration You can configure Zathura using $HOME/.config/zathura/zathurarc . Here is my zathurarc . # zoom and scroll step size set zoom-step 20 set scroll-step 80 # copy selection to system clipboard set selection-clipboard clipboard # enable incremental search set incremental-search true # zoom map <C-i> zoom in map <C-o> zoom out","tags":"linux","title":"Zathura: PDF Viewer for VIM Lovers"},{"url":"http://jhshi.me/2015/12/27/handle-keyboardinterrupt-in-python-multiprocessing/index.html","text":"multiprocessing is a convenient library to take advantage of the multiple cores easily found in modern processors. The typical pattern is to spawn a bunch of worker processes, and let them consume the data from a queue . However, when debugging, I usually found myself attempting to terminate the script using Ctrl-C only to find it has no effect. Working Example Here is a typical pattern to use multiprocessing.Process and multiprocessing.Queue . import multiprocessing class Worker ( multiprocessing . Process ): def __init__ ( self , queue , * args , ** kwargs ): super ( Worker , self ). __init__ ( * args , ** kwargs ) self . queue = queue def run ( self ): while True : item = self . queue . get () if item is None : break # process item here queue = multiprocessing . Queue () workers = [ Worker ( queue ) for _ in range ( multiprocessing . cpu_count ())] for w in workers : w . start () for i in items : queue . put ( i ) for w in workers : queue . put ( None ) queue . close () queue . join_thread () Here we spawn a number of workers, and let each of them consume input from the queue. Normally the main process gets stuck at the queue.join_thread() function. When you press Ctrl-C while the script is running, the subprocesses will not be terminated properly. First Attempt My first try is to catch the KeyboardInterrupt and then manually terminate the processes. try : queue .
join_thread () except KeyboardInterrupt : for w in workers : w . terminate () However, this won't work most of the time, especially when you have some serious computing going on in each process. Solution Then I noticed the daemon flag in the Process document. When a process exits, it attempts to terminate all of its daemonic child processes. So I set each child process's daemon attribute to be True : they are not creating sub-subprocesses anyway. Note that the daemon flag must be set BEFORE calling the processes' start function. Also, once the daemon flag is set, queue.join_thread() does not work anymore: you'll have to call join for each worker. for w in workers : w . daemon = True w . start () try : for w in workers : w . join () except KeyboardInterrupt : for w in workers : w . terminate ()","tags":"python","title":"Handle KeyboardInterrupt in Python MultiProcessing"},{"url":"http://jhshi.me/2015/12/15/fix-screen-brightness-issue-with-thinkpad-x1-carbon-3rd/index.html","text":"I recently installed Ubuntu 14.04.3 LTS on my Thinkpad X1 Carbon (3rd Edition). Most of the stuff worked out of the box, yet the screen brightness adjustment key had no effect. After Googling around, this solution worked. Create a file named /usr/share/X11/xorg.conf.d/20-intel.conf , with the following content. Section \"Device\" Identifier \"card0\" Driver \"intel\" Option \"Backlight\" \"intel_backlight\" BusID \"PCI:0:2:0\" EndSection Thanks to this post. http://itsfoss.com/fix-brightness-ubuntu-1310/ In fact, this issue is probably not related to this particular hardware. I found another thread on askubuntu.com . Yet the solution there did not work for me.","tags":"linux","title":"Fix Screen Brightness Issue with ThinkPad X1 Carbon 3rd Edition"},{"url":"http://jhshi.me/2015/12/06/troubleshooting-slow-vim-scrolling/index.html","text":"I was editing a LaTeX file using VIM and noticed that the scrolling was quite slow. Here is how I troubleshot it.
The Symptom I made a small screencast showing what I mean by \"slow\". I opened the file, moved the cursor to the first line, and then pressed and held j until the cursor reached the end of the file. You can easily notice the lag starting from around line 80. First Attempt: Plugins At first, one could easily blame certain plugins (especially since I recently installed YouCompleteMe ). I used Vundle to manage my plugins so it is relatively easy to disable them. But this led nowhere: even after I disabled all plugins, the problem persisted. Troubleshooting I found an excellent guide from the VIM Wiki about how to troubleshoot VIM problems. Here is what I tried: Run VIM without any customization---OK, no scrolling issue. $ vim -N -u NONE -i NONE main.tex Only load my .vimrc ---no luck, still sluggish. $ vim -N --noplugin -i NONE main.tex Binary-searching the issue inside my .vimrc using the finish command, which tells VIM to stop loading further commands. Finally, I was able to pinpoint this line inside my .vimrc : set cursorline Then I did a :h cursorline and found this: Highlight the screen line of the cursor with CursorLine |hl-CursorLine|. Useful to easily spot the cursor. Will make screen redrawing slower . I suspect this is because I have several large chunks (16 by 16) of tabular environments inside the file, but it is still surprising that a modern computer cannot handle such seemingly simple text editing. I Googled online and did not find any useful solutions. I guess for now I will just have to live with it. Fortunately I do not have many such LaTeX files.","tags":"vim","title":"Troubleshooting Slow VIM Scrolling"},{"url":"http://jhshi.me/2015/12/05/run-plex-media-server-as-another-user-in-ubuntu/index.html","text":"Recently I installed Plex Media Server on my Ubuntu box. Here is what I did to make it run as my user so that it can index any of my media files without permission issues.
The instructions here are for Ubuntu 14.04, but should be applicable to later Ubuntu versions as well. First, we need to tell Plex the user name it should run as. In /etc/default/plexmediaserver , change the PLEX_MEDIA_SERVER_USER variable to the user name you want to run Plex as. Second, we need to change the owner of Plex's App support directory. By default, it's /var/lib/plexmediaserver . If in doubt, you can check the PLEX_MEDIA_SERVER_APPLICATION_SUPPORT_DIR variable in /etc/systemd/system/plexmediaserver.service . $ sudo chown -R user:user /var/lib/plexmediaserver Finally, restart the Plex server: $ sudo service plexmediaserver stop $ sudo service plexmediaserver start Now make sure that the server is running as the user you specified: $ ps aux | grep \"plex\"","tags":"linux","title":"Run Plex Media Server as Another User in Ubuntu"},{"url":"http://jhshi.me/2015/11/14/acpi-error-method-parseexecution-failed-_gpe_l6f/index.html","text":"I recently built a PC based on Intel's latest Skylake CPU ( i5-6500 ) and Z170 chipset ( AsRock Z170 Pro4 ), and installed Ubuntu 15.10 on it. After setting up, however, I found that the kernel message buffer was flooded with this error message. Here is how I fixed it. TL;DR Add this line to /etc/rc.local : echo \"disable\" > /sys/firmware/acpi/interrupts/gpe6F Then reboot your PC, and the error message should be gone. If you want to learn more about how I came up with the fix, keep reading.
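One extra check worth making: on Ubuntu, /etc/rc.local only runs at boot if it is executable. A quick sanity sketch (the sysfs path is the one given above):

```shell
$ sudo chmod +x /etc/rc.local               # rc.local must be executable to run at boot
$ cat /sys/firmware/acpi/interrupts/gpe6F   # after a reboot, verify the GPE is disabled
```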
The Symptoms As described earlier, after the system booted up, this error message flooded the kernel ring buffer: [ 0.922778 ] ACPI Exception: AE_NOT_FOUND, while evaluating GPE method [ _L6F ] ( 20150619/evgpe-592 ) [ 0.923906 ] ACPI Error: [ PGRT ] Namespace lookup failure, AE_NOT_FOUND ( 20150619/psargs-359 ) [ 0.923908 ] ACPI Error: Method parse/execution failed [ \\_ GPE._L6F ] ( Node ffff8804654cd118 ) , AE_NOT_FOUND ( 20150619/psparse-536 ) These three error messages were printed over and over again, and the /var/log/kern.log file rapidly grew to several GB in about half an hour. Attempts After Googling around, I found this kernel bug report that describes exactly the same problem: https://bugzilla.kernel.org/show_bug.cgi?id=105491 It seems that adding acpi=off to the kernel arguments could eliminate this error, but that will also disable all the ACPI functionality, and the system would not shut down properly (it got stuck when doing a sudo reboot ). I also updated to the latest BIOS ( v2.80 ) from AsRock, but that still did not fix the problem. Finally, later on that thread, somebody mentioned that we could just disable the GPE.L6F function by echoing disable to a specific pseudo file in the /sys directory, hence the solution mentioned earlier.","tags":"errors","title":"ACPI Error: Method parse/execution failed [_GPE._L6F]"},{"url":"http://jhshi.me/2015/10/13/post-revision-plugin-for-pelican/index.html","text":"I used to have a post revision plugin for Octopress and I loved it. Here is my effort to achieve the same effect in Pelican. For this purpose, I came up with this post-revision plugin and also a template in the pelican-bootstrap3 theme. The plugin generates some meta data for each article that contains the history information. And the template consumes that meta data and presents it in the HTML files. Post Revision Plugin The plugin itself is quite straightforward.
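Before walking through the details, the core of the idea fits in a few lines of Python. This is only a sketch (the function names and the exact git log format string are my own, not the plugin's actual code):

```python
import subprocess

def parse_log(output):
    """Turn `git log --format=%ad|%s` output into (date, msg) tuples."""
    history = []
    for line in output.splitlines():
        # %ad and %s are joined by '|', so split on the first one
        date, _, msg = line.partition("|")
        history.append((date, msg))
    return history

def git_history(path):
    """Return the commit history of `path`, newest first (sketch)."""
    out = subprocess.check_output(
        ["git", "log", "--format=%ad|%s", "--date=short", "--", path],
        universal_newlines=True)
    return parse_log(out)

# The parsing step on canned output:
sample = "2015-10-13|Fix typo\n2015-10-11|Initial post"
print(parse_log(sample))
# → [('2015-10-13', 'Fix typo'), ('2015-10-11', 'Initial post')]
```

Since git log already lists commits newest first, the result is in reverse chronological order without extra sorting.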
The article or page's file path on the local file system can be accessed through the source_path attribute. After the site is generated (the page_generator_finalized and article_generator_finalized signals), we iterate through the articles and pages, and do a git log on them. For simplicity, right now we only store two pieces of information for each Git commit: date and commit message title (the first line). The commits are stored as a list of (date, msg) tuples in reverse order as the history attribute. Optionally, if you specify the GITHUB_URL and PROJECT_ROOT variables in the configuration file, this plugin also generates a link to the Github commit history page for the post, stored as the github_history_url attribute. Templates Now that we have the history meta data for the article, we can put it somewhere in the article template. I used the pelican-bootstrap3 theme. And the history information is in the post-revision.html template: {% if article.history %} <section class= \"well\" id= \"related-posts\" > {% if article.github_history_url %} <h4><a href= {{ article . github_history_url }} > {{ POST_REVISION_TEXT|default('Post History') }} </a></h4> {% else %} <h4> {{ POST_REVISION_TEXT|default('Post History:') }} </h4> {% endif %} {% for date, msg in article.history[:POST_HISTORY_MAX_COUNT|default(5)] %} <b> {{ date|strftime(\"%A %I:%M %p, %B %d %Y\") }} </b><br/><p> {{ msg }} </p> {% endfor %} </section> {% endif %} Here we have some more settings: POST_REVISION_TEXT is the title of the post revision section. POST_HISTORY_MAX_COUNT controls how many history items to show. You can see a working example down below.","tags":"pelican","title":"Post Revision Plugin for Pelican"},{"url":"http://jhshi.me/2015/10/11/migrating-from-octopress-to-pelican/index.html","text":"Recently I migrated this blog site from Octopress to Pelican . Here is why and how. What's Wrong with Octopress? I have been using Octopress for a while (actually almost two years!)
and it's been working great. In fact, I even wrote a few Octopress plugins myself (e.g., page-view , post-revision and popular-posts ) to make blogging easier. However, the major problem with Octopress is that building the site is super slow . Right now I have roughly 100 posts, and a build can take up to several minutes to finish (vs. 9 seconds in Pelican). And I just cannot stand it anymore. Additionally, Octopress is based on Jekyll , which uses Ruby, which I have never been a fan of. And the author of Octopress promised to clean up the spaghetti layout of the repository , yet it seems to be taking forever to finish. Why Pelican ? Pelican has several great features that look very appealing to me: Using Python ---my favorite language. The framework is packaged cleanly as a single Python package, so I can use Virtualenv and all that great stuff from Python . Supports reStructuredText and Markdown, so it's potentially easy to migrate from Octopress. Because it uses Python , I might actually be willing to fix a thing or two in case it breaks. Migration pelican-quickstart is great for setting up a minimal working directory quickly. After copying the posts from source/_posts to content , there are a couple of things to take care of. YAML Front Matter Octopress uses YAML front matter for post meta data, such as title, date, tags, etc. Pelican can recognize most of them except tags and category . More specifically, In Octopress , you can put a post in multiple categories using the categories attribute. But in Pelican, one post can only be in one category using the category attribute. This may cause trouble if you had some posts in multiple categories in Octopress. Pelican cannot recognize the YAML front matter for tags , which is very similar to a JSON array. For example, in Octopress, it's tags: [\"tag1\", \"tag2\"] . In Pelican, it's tags: tag1; tag2 . I fixed the first one by substituting all categories with category.
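For the record, a sed one-liner along these lines handles that substitution (the file name and front matter here are a made-up example):

```shell
# sample Octopress front matter (hypothetical post)
printf -- '---\ncategories: linux\n---\n' > post.markdown

# replace the plural Octopress key with Pelican's singular one
sed -i 's/^categories:/category:/' post.markdown

cat post.markdown
```

Run over content/*.markdown it converts every post in one pass.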
Then I tried to manually convert the YAML style array to the Pelican style array using sed , which failed miserably. Then I found the md-metayaml Pelican plugin, which was exactly what I needed. Just add md-metayaml to your PLUGINS and boom, Pelican can now recognize YAML front matter! Liquid Tags Octopress uses Liquid Tags for multimedia resources, such as images, videos, etc. Similar to YAML, there is also a liquid-tags plugin for Pelican. I mostly use the img tag, so I just added liquid_tags.img to the PLUGINS . You can add others as well, such as youtube , video . However, there is one tag that I used before that is missing in the liquid-tags plugin--- blockquote . Fortunately, I only used this tag in one post and I happily converted it using the standard Markdown block quote syntax. Summary Text By default, Pelican uses a fixed number of words as the post summary. I prefer the way that Octopress handles summaries: explicitly use an excerpt separator ( <!-- more --> ) to control which part goes to the post summary (typically the first paragraph). Again, there is this summary plugin that does exactly as mentioned above. Just put summary in PLUGINS and set SUMMARY_END_MARKER to be <!-- more --> . URL Pattern This is probably just me: the URL pattern on this site is actually inherited from the old days when I was using WordPress. Basically, the post URL is something like /2015/10/11/title-slug/index.html , which corresponds to the 2015-10-11-title-slug.markdown file in Octopress. First, we need to tell Pelican to obtain the URL slug from the file name: FILENAME_METADATA = '(?P<date>\\d{4}-\\d{2}-\\d{2})-(?P<slug>.*)' Then we set the article URL pattern: ARTICLE_URL = '{date:%Y}/{date:%m}/{date:%d}/{slug}/' ARTICLE_SAVE_AS = '{date:%Y}/{date:%m}/{date:%d}/{slug}/index.html' Wrap It Up At this point, we have done most of the migrations.
There are a couple more tweaks that make Pelican work better: Using a theme (I used pelican-bootstrap3 ) Add some awesome plugins, such as related_posts , tag_cloud , sitemap , and tipue_search . Enable monthly and yearly archives. You can see a full Pelican configuration file here .","tags":"octopress","title":"Migrating from Octopress to Pelican"},{"url":"http://jhshi.me/2015/10/10/google-mobile-summit/index.html","text":"I have been attending the Google Mobile Summit for the past two days. I'll share some of the exciting projects that Google is doing as well as my personal takeaways from them. Project Loon : LTE by ... Balloons! In a nutshell, this project wants to send a bunch of balloons to the stratosphere to act as LTE cellular towers. Personally, I feel like the most exciting and interesting part is to navigate the balloons by moving them up and down and leveraging the natural wind in the stratosphere to move the balloons to a designated area. I can imagine a lot of interesting research challenges there. In particular, I'm told that right now they need to obtain the wind direction data and predictions from third-party providers in order to plan the trajectory of the balloon. Instead, one might be able to learn the wind direction in an online fashion by dispatching balloons to different layers in the stratosphere for a short period of time. This may work because the wind directions in the stratosphere only change very slowly (on the order of several hours), so such wind direction measurements can be useful even when performed very infrequently. Physical Web : Cyberphysical Stuff The essence of this project is to put a BLE beacon device in every physical \"thing\" that we may want to interact with, such as parking meters, movie posters, etc. These beacons contain a short URL that will direct people to various web interfaces. At first glance, the idea is very similar to QR codes, or NFC tags.
And indeed, I think they are quite similar: a way to link physical beings to the cyber world. I was not quite convinced why BLE beacons are better than NFC tags in any meaningful way. Maybe BLE is more pervasive? And NFC is usually absent in high-end smartphones with metal back lids? I am not quite sure I buy these arguments... Project Fi : AT&T, Verizon, Sprint, or... Google? This is probably the most interesting project at this summit. Basically Google wants to serve as a \"virtual cellular provider\" that unifies different physical providers (such as Sprint, AT&T) and chooses whichever is better for you behind the scenes. Most of all, it's cheap! $20/month base rate and $10/GB data rate, as simple as that. And they even refund you for unused data packages! How cool is that! Unfortunately, as of now you do need an invitation to join, and a recent Nexus phone (Nexus 5X, 6P and 6). From a research perspective, this project also touches several known hard problems, most notably wireless handover, both between providers, and between Wifi and LTE. The latter may be the simpler of the two, since the device can simultaneously connect to Wifi and LTE to assess which one is better. However, since the device only has one cellular radio, it can only connect to one LTE provider at a time. Crowdsourcing may come in handy to determine the LTE quality based on locations, etc. Project Soli : Ant Radar They use 60GHz technologies to pack a tiny radar into wearable devices to enable touchless interaction. The demo is quite cool. I am always skeptical of such RF sensing stuff, but now that Google is at it, maybe they can make it actually work... Project Iris : \"Smart\" Contact Lenses OK, now Google tries to mess with your contact lenses :-) Basically, they developed this tiny tiny sensing system that can actually be embedded in contact lenses. The lenses measure the glucose level in tears for early diabetes detection.
It is amazing that they managed to pack so much stuff (sensors, battery, antenna) into such a small form factor. My only concern is that...do the lenses heat up?","tags":"research","title":"Google Mobile Faculty Summit"},{"url":"http://jhshi.me/2015/09/23/build-aosp-5-dot-1-1-for-nexus-5/index.html","text":"In this post I will talk about the extra steps to build a usable Lollipop (5.1.1) ROM for the LG Nexus 5 (hammerhead) device. Most of the functionalities work out of the box (bluetooth, Wifi tethering, camera, etc), but there are some show-stoppers. LGE Vendor Blobs For some unknown reason, the official LGE vendor blobs do not work out of the box, at least for Sprint phones. More specifically: No cellular data connection. No Sprint hidden menu app. Cannot update cellular profile and PRL. (Settings->More->Cellular Networks->Carrier Settings) I had this issue for KitKat before. Please refer to my previous post on how I resolved it last time. Long story short, I repeated the steps there and came up with an LGE vendor blob repo that fixes the problems mentioned above. Just clone the repo, put it in the /vendor/lge/ directory in your AOSP root, and check out the for_android-5.1.1_r3 tag. Apparently the repo was built specifically for the android-5.1.1_r3 tag from AOSP, but it should work for other 5.1.1 revisions as well. If not, just follow the steps in my previous post to update the binaries. Google Apps By default, AOSP does not contain any Google apps or services, but there are many resources online. I put up a version in this repo which contains most of the major Google apps and services. A special note: do not attempt to push too many Gapps, otherwise you could easily exceed the 1GB limit on the system partition size! Fused Location Provider The fused location provider lets your phone get a more accurate location much faster. It is provided through Google services so it is not enabled by default in AOSP. This patch enables the fused location service (in device/lge/hammerhead ).
diff --git a/overlay/frameworks/base/core/res/res/values/config.xml b/overlay/frameworks/base/core/res/res/values/config.xml index 8caef0c..a807ddc 100644 --- a/overlay/frameworks/base/core/res/res/values/config.xml +++ b/overlay/frameworks/base/core/res/res/values/config.xml @@ -287,4 +287,19 @@ <item>hsupa:4094,87380,704512,4096,16384,110208</item> </string-array> + <!-- Enable overlay for all location components. --> + <bool name=\"config_enableNetworkLocationOverlay\" translatable=\"false\">true</bool> + <bool name=\"config_enableFusedLocationOverlay\" translatable=\"false\">true</bool> + <bool name=\"config_enableGeocoderOverlay\" translatable=\"false\">true</bool> + <bool name=\"config_enableGeofenceOverlay\" translatable=\"false\">true</bool> + + <!-- + Sets the package names whose certificates should be used to + verify location providers are allowed to be loaded. + --> + <string-array name=\"config_locationProviderPackageNames\" translatable=\"false\"> + <item>com.google.android.gms</item> + <item>com.android.location.fused</item> + </string-array> </resources> Build Kernel In-Tree This is optional, but is a must if you want to do kernel development. Please refer to my previous post on how to integrate the kernel source into AOSP so that it gets built together with the rest of AOSP.","tags":"android","title":"Build AOSP 5.1.1 for Nexus 5"},{"url":"http://jhshi.me/2015/06/01/bypass-android-lockscreen-pin-code-using-recovery-and-adb/index.html","text":"One of the PhoneLab participants accidentally forgot the PIN code for his phone, and thus could not access the phone at all. There are numerous tutorials online on how to solve this. Here is what I tested that worked. Since PhoneLab devices are flashed with Clockworkmod recovery, I rebooted the phone (Nexus 5) into recovery mode, mounted the /system and /data partitions, ran adb shell into the phone, and deleted these two files: /data/system/password.key and /data/system/gesture.key .
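Condensed into commands, the steps above look roughly like this (a sketch; on Clockworkmod the partitions can also be mounted from the recovery menu instead):

```
$ adb shell mount /system
$ adb shell mount /data
$ adb shell rm /data/system/password.key /data/system/gesture.key
$ adb reboot
```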
Then the problem was fixed: no lock screen after powering on the phone. Here are two methods that I tried that failed . Both have something to do with the settings.db file. adb shell cd /data/data/com.android.providers.settings/databases sqlite3 settings.db update system set value = 0 where name = 'lock_pattern_autolock' ; update system set value = 0 where name = 'lockscreen.lockedoutpermanently' ; .quit And this one (from the xda-developers forum ): adb shell sqlite3 /data/data/com.android.providers.settings/databases/settings.db sqlite> update secure set value = 65536 where name = 'lockscreen.password_type' ; sqlite> .exit # exit adb reboot","tags":"Android","title":"Bypass Android Lockscreen PIN Code Using Recovery and ADB"},{"url":"http://jhshi.me/2015/04/09/disable-vim-spell-checking-for-c-string/index.html","text":"Vim has great built-in spell checking. Even better, when editing source code files, it is smart enough not to do spell checking in source code, which is quite neat. However, it will still do spell checking for string literals. Most of the time, this is not desired. This post shows how to tell VIM to do spell checking only in comments when editing code files. Vim figures out which regions of the content need spell checking from the syntax files. For example, for C files, the syntax file is located at /usr/share/vim/vim74/syntax/c.vim . There, you will find several lines that define the cString region. One example is the following line: syn region cString start=+L\\=\"+ skip=+\\\\\\\\\\|\\\\\"+ end=+\"+ contains=cSpecial,@Spell extend You'll notice that at the end, it says @Spell , which tells VIM that the string literals need spell checking. To override this behavior, and let VIM leave string literals alone when doing spell checking, we can change the definition of cString in our own syntax files.
For instance, you can put these lines in ~/.vim/after/syntax/c.vim : if exists(\"c_no_cformat\") syn region cString start=+L\\=\"+ skip=+\\\\\\\\\\|\\\\\"+ end=+\"+ contains=cSpecial else syn region cString start=+L\\=\"+ skip=+\\\\\\\\\\|\\\\\"+ end=+\"+ contains=cSpecial,cFormat endif if !exists(\"c_no_c11\") \" ISO C11 if exists(\"c_no_cformat\") syn region cString start=+\\%(U\\|u8\\=\\)\"+ skip=+\\\\\\\\\\|\\\\\"+ end=+\"+ contains=cSpecial else syn region cString start=+\\%(U\\|u8\\=\\)\"+ skip=+\\\\\\\\\\|\\\\\"+ end=+\"+ contains=cSpecial,cFormat endif endif They are almost identical to the default definitions in /usr/share/vim/vim74/syntax/c.vim , just with the trailing @Spell s removed. This technique can also be applied to other languages, such as Python or Java.","tags":"Vim","title":"Disable VIM Spell Checking for C String"},{"url":"http://jhshi.me/2015/01/19/fix-mac-address-clone-in-openwrt/index.html","text":"I used to be able to change the MAC address of the WAN interface by specifying the macaddr option in /etc/config/network . However, for unknown reasons , this no longer works in snapshot builds. Here is how to achieve the same effect using init scripts. In my router (TP-LINK WDR3500), eth1 is the WAN interface. Adjust this according to your case. First, verify that you can change the WAN interface's MAC address using ifconfig . root@OpenWrt:~# ifconfig eth1 down root@OpenWrt:~# ifconfig eth1 hw ether XX:XX:XX:XX:XX:XX root@OpenWrt:~# ifconfig eth1 up root@OpenWrt:~# ifconfig eth1 Substitute XX:XX:XX:XX:XX:XX with the MAC address you want to clone, and check the output of the last command to make sure the new MAC address is used. Next we want to automatically override the MAC address when the system boots up. We can use the init scripts. Edit /etc/init.d/clonemac and put the following content in it.
#!/bin/sh /etc/rc.common # Copyright (C) 2014 OpenWrt.org START = 94 STOP = 15 start () { ifconfig eth1 down ifconfig eth1 hw ether XX:XX:XX:XX:XX:XX ifconfig eth1 up } stop () { echo \"Stop.\" } For details of OpenWrt init scripts, please check the documentation . Make the script executable, then we can change the MAC address simply by this: root@OpenWrt:~# /etc/init.d/clonemac start To execute the script automatically on system boot, we need to enable it: root@OpenWrt:~# /etc/init.d/clonemac enable This will create a symbolic link to the clonemac script in /etc/rc.d . Reboot the router and you will find the new MAC address automatically used.","tags":"linux","title":"Fix MAC Address Clone in OpenWRT"},{"url":"http://jhshi.me/2014/12/31/benchmarking-android-file-system-using-iozone/index.html","text":"IOzone is a famous file system benchmark tool in the *nix world. In this post, I'll show you how to port it to Android and how to use it to benchmark both flash and Ramdisk performance. Build IOzone with AOSP I work on the AOSP tree on a daily basis, so it's handy for me to incorporate it into the AOSP tree to take advantage of the AOSP tool chain. The key part is to come up with an appropriate Android.mk file so that it gets built along with other sub-projects of AOSP. First, download the IOzone source tarball from its website. I'm using the latest tarball as of now (12/31/2014) with version 3.429. Then extract it to external/iozone --the usual place where we put all external upstream repos. Add an Android.mk file like this: LOCAL_PATH := $( call my-dir ) include $(CLEAR_VARS) OBJS = iozone.o libbif.o ALL = iozone %.o : %.
c @ $( NQ ) ' CC ' $@ $( Q )$( CC ) $( CFLAGS ) -c -o $@ $< iozone : $( OBJS ) $( CC ) $( LDFLAGS ) $( OBJS ) -lrt -lpthread -o iozone LOCAL_SRC_FILES := $( patsubst %.o,%.c, $( OBJS )) LOCAL_CFLAGS += -Wall -DANDROID -DO_RSYNC = 0 -DNO_THREADS LOCAL_CFLAGS += -O3 -Dunix -DHAVE_ANSIC_C -DNAME = '\"linux-arm\"' -DLINUX_ARM -Dlinux LOCAL_C_INCLUDES := $( KERNEL_HEADERS ) LOCAL_LDFLAGS := -Wl,--no-fatal-warnings LOCAL_MODULE_TAGS := eng LOCAL_SHARED_LIBRARIES := libc LOCAL_LDLIBS += -lpthread LOCAL_MODULE := iozone include $(BUILD_EXECUTABLE) Changes against the original Makefile that comes with the source code are: Do not build fileop.c , libasync.c and pit_server.c . They're not compatible with the AOSP source and we will not use them anyway. Define ANDROID in CFLAGS , which we'll use for some minor changes to the source code later. Define O_RSYNC ; somehow this flag definition is missing in AOSP's fcntl.h . The second part of CFLAGS is copied from the original Makefile 's linux-arm target. Add user space kernel headers to the include path. Add libc and libpthread . Then we need to modify the source code a little bit to cope with AOSP's header files.
Changes for iozone.c : diff --git a/iozone.c b/iozone.c index 85fdea0..36de106 100644 --- a/iozone.c +++ b/iozone.c @@ -403,8 +403,12 @@ typedef long long off64_t; #include <sys/time.h> #ifdef SHARED_MEM +#ifdef ANDROID +#include <linux/shm.h> +#else #include <sys/shm.h> #endif +#endif #if defined(bsd4_2) && !defined(MS_SYNC) #define MS_SYNC 0 Changes for libbif.c : diff --git a/libbif.c b/libbif.c index 890e226..f997e74 100644 --- a/libbif.c +++ b/libbif.c @@ -17,7 +17,7 @@ #include <sys/types.h> #include <stdio.h> #include <sys/file.h> -#if defined(__AIX__) || defined(__FreeBSD__) || defined(__DragonFly__) +#if defined(__AIX__) || defined(__FreeBSD__) || defined(__DragonFly__) || defined(ANDROID) #include <fcntl.h> #else #include <sys/fcntl.h> Finally, add iozone to your PRODUCT_PACKAGES so that it gets built when you do make in the AOSP root directory. Benchmark Results IOzone has a bunch of options. You can find the full documentation here . The options I used in this benchmark are: -a : auto mode. -z : test all record sizes. In particular, for larger files, test with small record sizes (4K, 8K, etc.) -n 4k : specify minimum file size to test. -g 512m : specify maximum file size to test. -e : include fsync and fflush when calculating time. -o : force writes synchronously to disk. -p : purge cache before each file operation. -f : specify test file path. I tested with both /sdcard/test.bin for flash and /mnt/asec for Ramdisk (or tmpfs). The smartphone I used is a Nexus 5 (hammerhead) running Android 4.4.4 KitKat. Here are the results: Flash Read: Flash Write: Ramdisk Read: Ramdisk Write: We can see that: The overall bandwidth with flash fluctuates a lot with different file or record sizes, while the bandwidths for Ramdisk are quite stable. As expected, the read throughput of flash is much better than write. The bandwidth of Ramdisk can be faster than flash by orders of magnitude.
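For reference, the flags listed above combine into a single invocation along these lines (a sketch of the flash run, using the file path mentioned in the text):

```
$ iozone -a -z -n 4k -g 512m -e -o -p -f /sdcard/test.bin
```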
One particularly interesting phenomenon is that, for flash read, when the record size is equal to the file size (4k-16M), the bandwidth is ridiculously high (~500MB/s). I am not sure about the reason yet.","tags":"Android","title":"Benchmark Android File System Using IOzone"},{"url":"http://jhshi.me/2014/11/16/mailman-configuration-with-nginx-plus-fastcgi-plus-postfix-on-ubuntu/index.html","text":"Here are the steps and caveats to set up a proper mail list on an Ubuntu server. The instructions are for Ubuntu 14.04 LTS, and should be easy to adapt for other platforms. Assumptions Here we assume the following: You have a domain, example.com . You want the mail list running on a machine with host name lists.example.com . The mail list address should look like somelist@example.com . You have set up the DNS MX record for example.com to point to lists.example.com . Please use MX Toolbox to double check. You already have an Nginx server up and running at lists.example.com . Background Before we dive into the setup, here is the role of each tool: Nginx: HTTP server, provides the Mailman web interface. FastCGI: CGI tool, dynamically generates the Mailman HTML pages. Postfix: Mail Transfer Agent, we use it to actually send and receive emails. Mailman: Mail list tool, member management. Suppose you send an email to somelist@example.com . Here is what will happen: Your email provider, say Gmail, queries the MX record of example.com and figures out it actually points to the IP address of lists.example.com . Gmail sends the email to lists.example.com . Postfix receives the email, and routes it to Mailman. Mailman figures out who is on this list, then tells Postfix to forward the email to them. List members receive this email sent by Postfix. Package Installation FastCGI $ sudo apt-get install fcgiwrap Open /etc/init.d/fcgiwrap , make sure FCGI_USER and FCGI_GROUP are both www-data . Mailman $ sudo apt-get install mailman During installation, choose language support, say en .
The instructions will also tell you to create a mailman list. Do NOT do this yet , we will create the list later, after we have configured Mailman properly. Postfix $ sudo apt-get install postfix # or this if you have installed postfix $ sudo dpkg-reconfigure postfix Make sure you choose the following: General type of mail configuration: Internet Site System mail name: example.com (without lists ) Root and postmaster mail recipient: your Linux user name on lists.example.com Other destinations to accept mail for: make sure example.com is there. Force synchronous updates on mail queue: No. Local networks: make sure example.com is there. Mailbox size limit: 0. Local address extension character: + (the plus sign). Internet protocols to use: all. Nginx Configuration In /etc/nginx/fastcgi_params , comment out this line: fastcgi_param SCRIPT_FILENAME $request_filename; Suppose your web server is configured in /etc/nginx/sites-available/www , add these lines to your server configuration: location /mailman { root /usr/lib/cgi-bin; fastcgi_split_path_info (^/mailman/[^/]+)(/.*)$; fastcgi_pass unix:///var/run/fcgiwrap.socket; include /etc/nginx/fastcgi_params; fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name; fastcgi_param PATH_INFO $fastcgi_path_info; } location /images/mailman { alias /usr/share/images/mailman; } location /pipermail { alias /var/lib/mailman/archives/public; autoindex on; } Restart the Nginx server, and you should be able to see the web page at http://lists.example.com/mailman/listinfo $ sudo service nginx restart Mailman Configuration Open /etc/mailman/mm_cfg.py , modify these lines: DEFAULT_URL_PATTERN : should be http://%s/mailman . DEFAULT_EMAIL_HOST : should be example.com . DEFAULT_URL_HOST : should be lists.example.com . Postfix Configuration Open /etc/postfix/main.cf , make sure these lines are correct: mydomain = example.com myhostname = lists.
$mydomain myorigin = /etc/mailname mydestination = $mydomain localhost.$mydomain $myhostname localhost mynetworks = $mydomain 127.0.0.0/8 [::ffff:127.0.0.0]/104 [::1]/128 alias_maps = hash:/etc/aliases alias_database = hash:/etc/aliases local_recipient_maps = proxy:unix:passwd.byname $alias_maps local_recipient_maps tells Postfix which local recipients to accept mail for. If you use Sendgrid for outgoing emails, also add these lines: smtp_sasl_auth_enable = yes smtp_sasl_password_maps = static:yourSendGridUsername:yourSendGridPassword smtp_sasl_security_options = noanonymous smtp_tls_security_level = encrypt header_size_limit = 4096000 relayhost = [smtp.sendgrid.net]:587 Create the First Mailing List Ok, now we have pretty much configured everything. Let's create the first list, called mailman , which will be used for Mailman logistics (like email reminders). $ sudo newlist mailman # Enter your email address and password It will tell you to paste these lines into /etc/aliases . ## mailman mailing list mailman: \"|/var/lib/mailman/mail/mailman post mailman\" mailman-admin: \"|/var/lib/mailman/mail/mailman admin mailman\" mailman-bounces: \"|/var/lib/mailman/mail/mailman bounces mailman\" mailman-confirm: \"|/var/lib/mailman/mail/mailman confirm mailman\" mailman-join: \"|/var/lib/mailman/mail/mailman join mailman\" mailman-leave: \"|/var/lib/mailman/mail/mailman leave mailman\" mailman-owner: \"|/var/lib/mailman/mail/mailman owner mailman\" mailman-request: \"|/var/lib/mailman/mail/mailman request mailman\" mailman-subscribe: \"|/var/lib/mailman/mail/mailman subscribe mailman\" mailman-unsubscribe: \"|/var/lib/mailman/mail/mailman unsubscribe mailman\" Then update the /etc/aliases.db database: $ sudo newaliases Then restart Mailman and Postfix: $ sudo service postfix restart $ sudo service mailman restart Now if you go to http://lists.example.com/mailman/listinfo , you should see the newly created Mailman list.
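As a toy illustration of how the alias entries above work (this is not part of the real setup; aliases.sample is a hypothetical scratch file), you can extract the pipe command Postfix would run for the bare mailman recipient:

```shell
# Write a tiny /etc/aliases-style snippet to a scratch file.
cat > aliases.sample <<'EOF'
mailman:         "|/var/lib/mailman/mail/mailman post mailman"
mailman-admin:   "|/var/lib/mailman/mail/mailman admin mailman"
EOF

# Split on double quotes; field 1 is the alias name, field 2 the command.
awk -F'"' '$1 ~ /^mailman:/ {print $2}' aliases.sample
# prints: |/var/lib/mailman/mail/mailman post mailman
```

The leading `|` is what tells Postfix to pipe the message into the command rather than deliver it to a mailbox.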
You can continue by adding other lists, and send test emails to these lists. About Aliases The /etc/aliases file tells Postfix how to route emails. In the above mailman example, when Postfix receives email for mailman@example.com , it knows to call the command /var/lib/mailman/mail/mailman post mailman . You can also tell Postfix to forward certain emails to another email address. For example: help: example.help@gmail.com Then if you send an email to help@example.com , Postfix will forward it to example.help@gmail.com .","tags":"Linux","title":"Mailman Configuration with Nginx+FastCGI+Postfix on Ubuntu"},{"url":"http://jhshi.me/2014/11/09/monitor-screen-touch-event-in-android/index.html","text":"In one of my projects I need to track every screen touch event in the background. That is, my app needs to be \"invisible\" while capturing every screen touch. Here is how I achieved this. The idea is to define a dummy UI fragment that is really tiny (say, 1x1 pixel), place it on one of the corners of the screen, and let it listen to all touch events outside it. Well, literally, it's not \"invisible\"; in fact it's in the foreground all the time! But since it's so tiny, hopefully users won't feel a difference. First, let's create this dummy view: mWindowManager = (WindowManager) mContext.getSystemService(Context.WINDOW_SERVICE); mDummyView = new LinearLayout(mContext); LayoutParams params = new LayoutParams(1, LayoutParams.MATCH_PARENT); mDummyView.setLayoutParams(params); mDummyView.setOnTouchListener(this); Here we set the width of the dummy view to 1 pixel, and the height to the parent height. We also set up a touch event listener on this dummy view, which we'll implement later. Then let's add this dummy view. LayoutParams params = new LayoutParams(1, /* width */ 1, /* height */ LayoutParams.TYPE_PHONE, LayoutParams.FLAG_NOT_FOCUSABLE | LayoutParams.FLAG_NOT_TOUCH_MODAL | LayoutParams.
FLAG_WATCH_OUTSIDE_TOUCH, PixelFormat.TRANSPARENT); params.gravity = Gravity.LEFT | Gravity.TOP; mWindowManager.addView(mDummyView, params); The key here is the FLAG_WATCH_OUTSIDE_TOUCH flag: it enables the dummy view to capture all events on the screen, whether or not the event is inside the dummy view. Finally, let's handle the touch events by implementing the View.OnTouchListener interface. @Override public boolean onTouch(View v, MotionEvent event) { Log.d(TAG, \"Touch event: \" + event.toString()); // log it return false; } We need to return false since we're not really handling the event, so that the underlying real UI elements can still get those events. A final note: to keep our dummy view listening to touch events at all times, we need to wrap all of this in a Service : we create the dummy view in onCreate and add it to the screen in onStartCommand . The service should also implement View.OnTouchListener to receive the touch events.","tags":"Android","title":"Monitor Screen Touch Event in Android"},{"url":"http://jhshi.me/2014/11/09/aosp-release-tools/index.html","text":"AOSP ships with a bunch of tools that are very useful for platform releases. I'll cover their usage and explain what they do in this post. Generate Target Files Usually when you develop locally, you would use a plain make with no particular target to compile AOSP. When you prepare for a release, however, you need to do this instead: $ make -j16 dist It will first compile the whole source tree, as a plain make does. Then it will generate several zip files in out/dist that will be used in later stages of the release. Here are the files for Nexus 5 (hammerhead) of platform version 1.2; the names may be slightly different in your case. aosp_hammerhead-target-files-1.2.zip contains all the target files (apks, binaries, libraries, etc.) that will go into the final release package. This is the most important file and will be used extensively later on.
aosp_hammerhead-apps-1.2.zip contains all the apks. aosp_hammerhead-emulator-1.2.zip contains images that are suitable for booting on an emulator. aosp_hammerhead-img-1.2.zip contains image files for system , boot , and recovery . Suitable for fastboot update . aosp_hammerhead-ota-1.2.zip is an OTA package that can be installed through recovery. aosp_hammerhead-symbols-1.2.zip contains all files in out/target/product/hammerhead/symbols . Sign Target Files Each APK in the final release has to be properly signed. In each Java project that will eventually generate an APK, developers can specify which key should be used to sign the apk by defining LOCAL_CERTIFICATE . For example, in the Android.mk file of packages/apps/Settings , there is this line: LOCAL_CERTIFICATE := platform which indicates that Settings.apk should be signed using the platform key. You can also set LOCAL_CERTIFICATE to PRESIGNED , which tells the signing script (see below) that these APKs are already signed and should not be signed again. This is usually the case when those APKs are provided as vendor blobs. There are four types of keys in AOSP, and the default keys are shipped in build/target/product/security . As you'll find in the README file, they are: testkey -- a generic key for packages that do not otherwise specify a key. platform -- a test key for packages that are part of the core platform. shared -- a test key for things that are shared in the home/contacts process. media -- a test key for packages that are part of the media/download system. Actually, after the first step ( make dist ) the target APK files are signed with these keys, which we should replace with our own keys in this step. AOSP provides a Python script, build/tools/releasetools/sign_target_file_apks , for this purpose. You can take a look at the Python doc at the head of that file for complete usage.
A typical usage will look like this: $ ./build/tools/releasetools/sign_target_file_apks -o -d $KEY_DIR out/dist/aosp_hammerhead-target_files-1.2.zip /tmp/signed.zip In which: -o tells the script to replace the OTA keys. This will make system/etc/security/otacerts.zip in the final image contain your platform keys instead of the default ones. -d indicates that you're using the default key mapping. $KEY_DIR should be the directory that contains your private keys. This script will first unpack the input target files, then sign each APK using the proper key, and repack them into a new signed target-files zip. Generate Release File This step depends on what kind of release file you want to generate. You can either generate a full image file that is suitable for fastboot update , or you can generate an OTA file that can be applied via recovery. Full System Image $ ./build/tools/releasetools/img_from_target_files /tmp/signed.zip /tmp/final-release.img This script will pack the signed target files into one image file that can be flashed via fastboot update . This is useful when you do your first release. OTA Package For OTA, you can choose between a full OTA and an incremental OTA. In either case, you can reboot the device into recovery mode and use adb sideload to flash the update for testing. To generate a full OTA package: $ ./build/tools/releasetools/ota_from_target_files -k $KEY_DIR/platform /tmp/signed.zip /tmp/final-full-ota.zip In which the -k option specifies the key used to sign the OTA package. The package contains all the files needed by the system , boot and recovery partitions. Incremental OTA The OTA package generated in the last step is quite large (~380MB for KitKat). If the changes since the last release are small, you may want to generate an incremental OTA package instead, which only contains the differences. To do this, you need the signed target files from the last release.
Therefore, I strongly suggest that you check the signed target files of each release into your VCS, just in case you want to do an incremental OTA in the future. $ ./build/tools/releasetools/ota_from_target_files -k $KEY_DIR/platform -i /tmp/last-signed.zip /tmp/signed.zip /tmp/final-full-ota.zip The difference is that we specify the base target files, /tmp/last-signed.zip . The script will compare the current target files with the ones from the last release, and will generate binary diffs where they differ. You may also check my previous post about how to apply the OTA package programmatically .","tags":"Android","title":"AOSP Release Tools"},{"url":"http://jhshi.me/2014/11/08/replicate-gem-installation/index.html","text":"I use Octopress to manage my blogs, which relies on the correct ruby gem versions to work. Although Octopress uses Bundler to manage the gem dependencies, sometimes a simple bundle install does not work out of the box. Since everything works fine on one of my machines, I decided to replicate the exact ruby/gem setup of that machine. Dump Gem List First, dump all the gems and their versions to a text file on the machine that you want to replicate. $ gem list | tail -n+2 | sed 's/(/--version /' | sed 's/)//' > gemlist Here we first dump all the gems using gem list , then remove the first line of the output ( ***LOCAL GEMS*** ), replace the left parenthesis with --version for later convenience, and remove the right parenthesis. If your app uses Bundler, you should use this command instead of the one above, to make sure that we install exactly the same set of gems for that app. $ bundle | head -n-2 | cut -d ' ' -f2,3 | sed 's/ / --version /' > gemlist Since the output of bundle is a bit different from gem list , we first remove the last two lines of the output (see below), then split each line on whitespace and keep only the second (gem name) and third (version) fields, and finally substitute the whitespace with --version , similar to above. ....
Using stringex 1.4.0 Using bundler 1.7.3 Your bundle is complete! Use `bundle show [gemname]` to see where a bundled gem is installed. Install the Gems Copy the gemlist file to the machine that you want to install the gems on, and use the commands below to install them. To make sure we have a clean slate, we first remove all gems: $ gem list | cut -d \" \" -f1 | xargs sudo gem uninstall -aIx Then install all the gems; here we do not install documentation, for the sake of time. $ cat gemlist | xargs -L 1 sudo gem install --no-ri --no-rdoc Here we use the -L 1 option to tell xargs to treat each line as a separate command. Finally, before you run rake in your project, remember to delete the Gemfile.lock file: it may contain some obsolete gems and mislead Bundler.","tags":"linux","title":"Replicate Gem Installation"},{"url":"http://jhshi.me/2014/10/06/install-acrobat-reader-on-ubuntu-14-dot-04/index.html","text":"Recently I needed to install Adobe Acrobat Reader on a couple of my Ubuntu boxes. The process is full of black magic, and sometimes you can't find the documentation anywhere. Hopefully this post will make the process less of a pain. Install Dependencies This is probably a superset of all the dependencies. I used trial and error in the process and I'm not quite sure which packages are really necessary... With these packages, if I run acroread from the command line: There are no GTK warnings whatsoever. It can open PDFs with forms (like the ones for Canada visa applications). The icons and menus look \"normal\". Note that since Acrobat Reader is a 32-bit application, if you're on a 64-bit system, remember to append :i386 to whatever extra packages you want to install besides the ones in this list.
sudo apt-get install \\ libgtk2.0-0:i386 \\ libnss3-1d:i386 \\ libnspr4-0d:i386 \\ lib32nss-mdns \\ libxml2:i386 \\ libxslt1.1:i386 \\ libstdc++6:i386 \\ libcanberra-dev:i386 \\ libcanberra-gtk-dev:i386 \\ libcanberra-gtk-module:i386 \\ gtk2-engines:i386 \\ gtk2-engines-*:i386 \\ gnome-themes-standard:i386 \\ unity-gtk2-module:i386 \\ libpangoxft-1.0.0:i386 \\ libpangox-1.0.0:i386 \\ libidn11:i386 \\ dconf-gsettings-backend:i386 Download the Deb Package cd ~/Downloads && wget -c http://ardownload.adobe.com/pub/adobe/reader/unix/9.x/9.5.5/enu/AdbeRdr9.5.5-1_i386linux_enu.deb If the above link fails, try this one instead: cd ~/Downloads && wget -c ftp://ftp.adobe.com/pub/adobe/reader/unix/9.x/9.5.5/enu/AdbeRdr9.5.5-1_i386linux_enu.deb Installation sudo dpkg -i ~/Downloads/AdbeRdr9.5.5-1_i386linux_enu.deb This should complete without any problems if you installed all the packages in the dependencies. But in case dpkg still complains, run this command after dpkg -i : sudo apt-get -f install This should fix any further missing dependencies. Configuration If you want Acrobat Reader to open PDF files by default, run this command and choose Acrobat Reader from the list: mimeopen -d *.pdf Please choose a default application for files of type application/pdf 1) Document Viewer (evince) 2) Print Preview (evince-previewer) 3) GIMP Image Editor (gimp) 4) Adobe Reader 9 (AdobeReader) 5) Other... use application #4","tags":"linux","title":"Install Acrobat Reader on Ubuntu 14.04"},{"url":"http://jhshi.me/2014/10/03/regular-expression-support-in-android-logcat-tag-filters/index.html","text":"For a while I've been using the logcat command line tool to check Android logs. Usually, the tags of my app consist of a common prefix plus the name of a sub-component (I guess that's also what most apps do). And I have about a dozen such tags. logcat , however, does not support filtering tags using regular expressions, which is a pain!
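To make the desired behavior concrete, here is a toy offline illustration (not the wrapper described below) of matching a family of similar tags with one regular expression, applied to logcat-style lines with grep -E . The tag names are made up:

```shell
# Match tags "MyApp-task1", "MyApp-Task2", etc. with a single regex --
# something plain logcat TAG:LEVEL filters cannot express.
printf '%s\n' \
  'V/MyApp-task1(  42): hello' \
  'D/MyApp-Task2(  42): world' \
  'I/Other(  99): noise' |
grep -E '^./MyApp-[Tt]ask[0-9]+\('
# prints the MyApp-task1 and MyApp-Task2 lines only
```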
After suffering for a long time, I finally decided to tackle this. Logcat Tag Filters The basic logcat options can be found in the official document , which also contains a brief explanation of the logcat filter format . Basically, you provide a series of logcat filters, each with the format TAG:LEVEL , where TAG must be the exact tag you want to filter, and LEVEL can be one of the characters listed in the document. So if you have a bunch of similar tags, such as MyApp-task1 , MyApp-Task2 , etc., you'll have to spell them all out in full. Although you can save a few keystrokes by setting the ANDROID_LOG_TAGS environment variable, that still only solves part of the pain. Note that the order of the filters matters. In short, logcat looks at the filters from left to right, and uses the first one that matches the tag of a log line. For example, if you use MyApp:V *:S , then only the log lines with tag MyApp will be printed; other log lines will be suppressed by the *:S filter. However, if you use *:S MyApp:V , then no log lines will be printed, because the first filter, *:S , matches all log line tags, and thus all log lines are suppressed by this filter. For details, please refer to the android_log_shouldPrintLine function in this file . My Logcat Wrapper We can make logcat support regular expression tag filters in two ways. One is modifying the logcat source code in the AOSP tree and building a new logcat binary that supports REs. The other is to filter the tags \"offline\" on the host PC where you run the adb logcat command, i.e., a logcat wrapper. I adopted the second approach, since the first one has a few drawbacks: You'd have to match REs in C++, which I assume is not quite enjoyable. You'd have to cross-compile the logcat binary, which requires you to set up the whole AOSP development environment. For each device that you want to run logcat on, you'd have to replace the logcat binary. So here is how my wrapper works.
It calls the adb logcat command without any filters, to get all the log lines. Then it parses the output log lines, and only prints the lines whose tags match the regular expressions provided. It supports basic logcat options, such as -b , -c , -g , etc., by just passing those options through to the real logcat . It processes the log filters in the same order as logcat does, to stay as close as possible to the original logcat semantics. The idea is that you run the wrapper the same way you would run logcat , and it just does the magic RE tag filtering for you. You can find this tool in my github repo . View Logs You can directly print those log lines to the console. I personally prefer to redirect them to a temporary file and use vim to view it, which gives me features like incremental highlighted search, etc. There is a sweet recipe which tells vim to automatically refresh the buffer when it's modified outside. This is a perfect fit for viewing log files.","tags":"android","title":"Regular Expression Support in Android Logcat Tag Filters"},{"url":"http://jhshi.me/2014/09/21/get-packet-signal-strength-of-rtl8187-dongle/index.html","text":"In one of my research projects, I used Android PCAP Capture with ALFA RTL8187L dongles to capture Wi-Fi packets on Android phones. One problem I encountered was that the per-packet RSSI was missing. After poking around the source code for a couple of days, I finally figured out how to get this information. In short, the per-packet RSSI information IS indeed reported by the hardware, yet the current Android PCAP app doesn't collect it. RTL8187 Rx Descriptor Normally, the Wi-Fi chipset reports certain PHY layer information (RSSI, AGC, etc.) along with the actual 802.11 packet in the form of a vendor \"header\". In the case of RTL8187L, it's a bit confusing because the \"header\" is actually at the end of the delivered packet. This is the detailed format of the RTL8187 Rx descriptor (p.25 of the datasheet ).
The most interesting parts related to signal strength are AGC and RSSI. They all, in a way, reflect the signal quality of the received packet. However, as per the Linux kernel rtl8187 driver , \"none of these quantities show qualitative agreement with AP signal strength, except for the AGC\". We'll worry about this later. For now, we focus on how to extract these values from the packet. Get the Values In the PCAP capture source code ( RTL8187Card.java ), there is a usbThread which keeps pulling data from the dongle. When it gets a packet, the last 16 or 20 bytes are trimmed, depending on whether it's an RTL8187L or an RTL8187B. Those 16 or 20 bytes are the Rx descriptor. So instead of truncating them, we save them in a separate array. diff --git a/src/net/kismetwireless/android/pcapcapture/Rtl8187Card.java b/src/net/kismetwireless/android/pcapcapture/Rtl8187Card.java index b8e1a44..7628446 100644 --- a/src/net/kismetwireless/android/pcapcapture/Rtl8187Card.java +++ b/src/net/kismetwireless/android/pcapcapture/Rtl8187Card.java @@ -1868,13 +1868,16 @@ public class Rtl8187Card extends UsbSource { // int sz = mBulkEndpoint.getMaxPacketSize(); int sz = 2500; byte[] buffer = new byte[sz]; + byte[] header; while (!stopped) { int l = mConnection.bulkTransfer(mBulkEndpoint, buffer, sz, 1000); int fcsofft = 0; + header = null; if (l > 0) { if (is_rtl8187b == 0 && l > 16) + header = Arrays.copyOfRange(buffer, l-16, l); l = l - 16; else if (l > 20) l = l - 20; @@ -1889,6 +1892,11 @@ public class Rtl8187Card extends UsbSource { if (mPacketHandler != null) { Packet p = new Packet(Arrays.copyOfRange(buffer, 0, l)); p.setDlt(PcapLogger.DLT_IEEE80211); + if (header != null) { + int noise = header[4]; + int rssi = header[5] & 0x7f; + int agc = header[6]; + } /* if (fcs) Here, we save the RTL8187L header in a separate byte array, and get the relevant fields from it.
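The byte offsets used above (noise at byte 4, RSSI in the low 7 bits of byte 5, AGC at byte 6 of the 16-byte tail) can be sketched with plain shell arithmetic. The descriptor bytes below are made-up values, not a real capture:

```shell
# Hypothetical 16-byte Rx descriptor tail, as hex bytes.
desc="00 01 02 03 5a 8f 30 07 08 09 0a 0b 0c 0d 0e 0f"
set -- $desc                 # $1..$16 hold offsets 0..15

noise=$((16#$5))             # byte 4
rssi=$(( 16#$6 & 16#7f ))    # byte 5, low 7 bits only
agc=$((16#$7))               # byte 6
echo "noise=$noise rssi=$rssi agc=$agc approx_dBm=$((rssi - 100))"
# prints: noise=90 rssi=15 agc=48 approx_dBm=-85
```

The final field applies the RSSI-100 dBm approximation discussed in the next section.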
Meaningful RSSI Although the Linux kernel driver sheds some light on how to get a meaningful RSSI out of the RTL8187L header, in my experiments I found that RSSI-100 is a fair enough approximation of the real RSSI in dBm . For example, if the RSSI field value is 15, then the actual RSSI is 15-100=-85dBm. Sometimes this approach will give you strange RSSI values (e.g., positive ones), yet most of the time the calculated values are quite meaningful, and the RSSI of beacon frames calculated this way is consistent with what you'll get from Android scan results.","tags":"android","title":"Get Packet Signal Strength of RTL8187 Dongle"},{"url":"http://jhshi.me/2014/08/28/sign-issues-related-to-ota-update/index.html","text":"In my previous posts, I explained how to create a properly signed OTA package that will pass recovery's signature check, and how to verify the signed OTA package before applying it . Here, we'll discuss, when building a production AOSP platform , how to sign the platform and recovery image properly to match those signature checks. In the following discussion, we assume you have a key pair: platform.x509.pem and platform.pk , which you'll use to sign the OTA package. Suppose the keys are stored in a directory with path $KEYS . I'm using Nexus 5 (hammerhead) as an example below, but the practice should be easy to apply to other devices. Platform OTA Certificates When verifying an OTA package's signature using Android's RecoverySystem.verifyPackage utility, that function actually checks against the certificates stored in /system/etc/security/otacerts.zip . So if you want to push OTA updates later, you'll have to generate the proper certificates when building the platform. You can accomplish this by specifying PRODUCT_OTA_PUBLIC_KEYS in your device's Makefile ( device/lge/hammerhead/full_hammerhead.mk in my case). PRODUCT_OTA_PUBLIC_KEYS := $KEYS Then the build process will store this location in META/otakeys.txt in the unsigned zip file.
When you sign the target files using the sign_target_files_apks tool, it will generate the proper OTA certificates based on the OTA keys provided. If PRODUCT_OTA_PUBLIC_KEYS is not defined, it will just use the release key, which is probably not what you used to sign the OTA packages. Recovery Signature Verification When you programmatically apply an OTA package using the RecoverySystem.installPackage function, it will boot the device into recovery mode and let the recovery do the update. The recovery will first check the signature of the OTA package. So when building the platform, you'll also need to specify the extra recovery keys by defining PRODUCT_EXTRA_RECOVERY_KEYS . PRODUCT_EXTRA_RECOVERY_KEYS := $KEYS After setting PRODUCT_OTA_PUBLIC_KEYS and PRODUCT_EXTRA_RECOVERY_KEYS , you should be able to pass all signature verifications and successfully apply the OTA update.","tags":"android","title":"Signing Issues related to OTA Update"},{"url":"http://jhshi.me/2014/07/11/print-uint64-t-properly-in-c/index.html","text":"stdint.h provides various machine-independent integer types that are very handy to use, especially the uint64_t family. You would assume it's something like long long unsigned int and be tempted to use %llu in printf , which, however, will be reported as a warning by any decent compiler. warning : format '%llu' expects argument of type 'long long unsigned int' , but argument 4 has type 'uint64_t' [- Wformat ] The Right Way The right way to print uint64_t in the printf / snprintf family of functions is this ( source ): #define __STDC_FORMAT_MACROS #include <inttypes.h> uint64_t i; printf(\"%\" PRIu64 \"\\n\", i); PRIu64 is a macro introduced in C99; these macros are supposed to mitigate platform differences and \"just print the thing\". More macros for the printf family can be found here . The Story In my case, I mistakenly used %lu to print a uint64_t integer. Of course, the compiler gave a warning on this. But...you know, it's \"just warnings\", should be no big deal.
Well, 80% of the time it is fine. Yet this time, it's not. Since uint64_t takes 8 bytes but %lu will only eat 4 bytes, my next print argument, %s , comes in and happily prints who knows what... Never ignore warnings, NEVER.","tags":"linux","title":"Print uint64_t Properly In C"},{"url":"http://jhshi.me/2014/07/02/fix-adb-permissions/index.html","text":"I've been bothered by this message for a while when the device is in recovery mode. $ adb devices List of devices attached ???????????? no permissions The thing is, I've set up my udev rules according to the official AOSP building guide , and it works fine in normal mode. Yet the above message shows up when the device is put in recovery mode. There are some solutions online suggesting that you restart ADB as root, which I don't think is a very good idea. Then I figured that if it had to do with my udev rules, maybe they didn't cover the device I used (Nexus 5 from LG). A lsusb with the device in recovery mode gives me this: $ lsusb Bus 002 Device 010: ID 18d1:d001 Google Inc. The first part of the ID (18d1) is the vendor ID, and the second part (d001) is the product ID. However, from Google's vendor list , LG's vendor ID should be 1004, whereas Google's vendor ID is 18d1. What the hell, just add them to the udev rules: # adb protocol on recovery for Nexus 5 SUBSYSTEM==\"usb\", ATTR{idVendor}==\"18d1\", ATTR{idProduct}==\"d001\", MODE=\"0600\", OWNER=\"<YOUR_USER_NAME>\" After that, unplug the device and plug it in again. It should be recognized, like so: $ adb devices List of devices attached 060fb526f0eca244 recovery And this approach can be extended to any case where either adb or fastboot has permission issues.
Just do a lsusb , find out the actual vendor and product IDs, and add them to your udev rules.","tags":"android","title":"Fix ADB Permissions"},{"url":"http://jhshi.me/2014/06/30/stop-android-logcat-from-truncating-log-line/index.html","text":"When analyzing logcat data from Android devices, I found that sometimes log lines get truncated, especially when they are quite long. This causes trouble because the logged information is in JSON format, which breaks after even one character is truncated. In this post, I'll explain how the truncation happens, and how to fix it. Android Logging This page gives a detailed explanation of the Android logging system. In short, three parts work together to make Android logcat work. The logger device driver in the kernel ( kernel/drivers/staging/android ), which serves read/write requests from user space and also buffers the log content. The android.util.Log class ( framework/base/core/java/android/util/Log.java ), a Java wrapper that writes to the logger device. logcat ( system/core/log ), a native tool that reads logs from the logger device. Truncating Let's follow the flow when Log.v is called with a log message, and find out who truncates the log message (if it's too long). In framework/base/core/java/android/util/Log.java , when Log.v is called, it just calls the native method println_native with two extra arguments, LOG_ID_MAIN and VERBOSE . The first specifies the log device to write to, and the second gives the log level. println_native , defined in framework/base/core/jni/android_util_Log.cpp , just calls the function named __android_log_buf_write . So far, nobody has changed the log message yet. __android_log_buf_write is defined in system/core/liblog/logd_write.c ; it first detects a few special tags to redirect them to the radio log device, then packs the log message into struct iovec data structures and passes them on to write_to_log , which is initialized as _write_to_log_kernel .
Eventually, these iovec s go to writev in system/core/liblog/uio.c , which issues the write syscall on the log device. Thus, the log line content is still intact before entering kernel space. Next, the write request is directed to the logger_aio_write function defined in kernel/drivers/staging/android/logger.c . One line (462) caught my attention: header.len = min_t(size_t, iocb->ki_left, LOGGER_ENTRY_MAX_PAYLOAD); This is where the truncation happens! How to Fix LOGGER_ENTRY_MAX_PAYLOAD is defined in kernel/drivers/staging/android/logger.h as 4076 , which I guess is (4096-20), where 20 is the log header structure size. We cannot eliminate truncation completely, the buffer size is limited after all. But we can enlarge the payload limit a bit to prevent some unnecessary truncation. I changed it to 65516 (65536-20), which should be large enough. Also, the logger device maintains a ring buffer for each log device, defined in kernel/drivers/staging/android/logger.c . The default buffer size is 256K. I changed the buffer size for the main device to 4MB, while leaving the others unchanged. (I also tried 32MB, yet apparently that's far too large and the kernel refused to boot.) UPDATE To make the Android logcat tool work properly, we also need to modify system/core/include/log/logger.h in the AOSP source tree, which is a mirror of the logger.h in the kernel. LOGGER_ENTRY_MAX_PAYLOAD needs to be the same as the one in the kernel, and LOGGER_ENTRY_MAX_LEN needs to be a bit larger than LOGGER_ENTRY_MAX_PAYLOAD . In my case, I set the former to 65516 and the latter to (64*1024) .","tags":"android","title":"Stop Android Logcat from Truncating Log Line"},{"url":"http://jhshi.me/2014/06/30/build-kernel-in-tree-with-aosp-for-nexus-5-hammerhead/index.html","text":"Google has a fair document on building kernels for Android . Yet it doesn't cover how to integrate the kernel with the AOSP source tree so that the kernel gets built along with the whole platform, which I'll explain in this post.
Here I'll mainly focus on android-4.4.4_r1 (KitKat) for Nexus 5 ( hammerhead ). The instructions should be easy to adapt to other models or AOSP releases. Determine Kernel Version The best and safest way to determine the right kernel version is to examine the pre-included kernel image. For hammerhead, it's in device/lge/hammerhead-kernel/ . $ bzgrep -a 'Linux version' device/lge/hammerhead-kernel/vmlinux.bz2 Linux version 3.4.0-gd59db4e ( android-build@vpbs1.mtv.corp.google.com ) ( gcc version 4.7 ( GCC ) ) #1 SMP PREEMPT Mon Mar 17 15:16:36 PDT 2014 As per this stackoverflow thread , the commit hash you want is the d59db4e part of the version name, without the leading g . Download the Sources For hammerhead, the kernel sources live in the msm tree. After cloning it into a kernel directory, check out the commit hash you found in the step above. $ git clone https://android.googlesource.com/kernel/msm.git kernel $ cd kernel $ git checkout d59db4e Adapt kernel/AndroidKernel.mk Two changes need to be made for the kernel to be successfully built in-tree. Use zImage-dtb instead of zImage as the target. First, change TARGET_PREBUILT_INT_KERNEL (~line 8). -TARGET_PREBUILT_INT_KERNEL := $(KERNEL_OUT)/arch/arm/boot/zImage +TARGET_PREBUILT_INT_KERNEL := $(KERNEL_OUT)/arch/arm/boot/zImage-dtb Then change the corresponding make rule (~line 47). $(TARGET_PREBUILT_INT_KERNEL): $(KERNEL_OUT) $(KERNEL_CONFIG) $(KERNEL_HEADERS_INSTALL) - $(MAKE) -C kernel O=../$(KERNEL_OUT) ARCH=arm CROSS_COMPILE=arm-eabi- + $(MAKE) -C kernel O=../$(KERNEL_OUT) ARCH=arm CROSS_COMPILE=arm-eabi- zImage-dtb Do not build modules (~lines 48-51).
- $(MAKE) -C kernel O=../$(KERNEL_OUT) ARCH=arm CROSS_COMPILE=arm-eabi- modules - $(MAKE) -C kernel O=../$(KERNEL_OUT) INSTALL_MOD_PATH=../../$(KERNEL_MODULES_INSTALL) INSTALL_MOD_STRIP=1 ARCH=arm CROSS_COMPILE=arm-eabi- modules_install - $(mv-modules) - $(clean-module-folder) Adapt device/lge/hammerhead Project Next we need to tell the device to build the kernel, instead of copying the pre-built one. This patch should do the trick. Basically, a new AndroidBoard.mk file is added to include the rules to build and copy the kernel. And some lines in device.mk related to the kernel are removed, since they're already taken care of in AndroidBoard.mk . Build It! After all the above changes, do a make clobber to make sure we have a clean slate , otherwise some strange errors may strike you. Then just build AOSP in the normal way and the kernel should get built on the fly. Here is a snapshot of the kernel version I built. The version name is no longer d59db4e because I made some changes. Credits Thanks to this blog from Jameson for describing most of it. UPDATE The above setup works fine as long as you don't specify a separate output directory , since we assume the kernel output directory is ../$(KERNEL_OUT) in the make options. Apparently, it will fail if the out directory is not the default one. The kernel Makefile supports two ways of specifying the output directory (see comments starting from line 79). One is to use the O= command line option; the other is to set the KBUILD_OUTPUT environment variable. Since we use the -C option to first switch the working directory, the O= option is a bit tricky to use, so we leverage the KBUILD_OUTPUT variable.
We first figure out the absolute path of KERNEL_OUT : FULL_KERNEL_OUT := $( shell readlink -e $( KERNEL_OUT )) Then we set KBUILD_OUTPUT before calling make : $(KERNEL_CONFIG) : $( KERNEL_OUT ) env KBUILD_OUTPUT = $( FULL_KERNEL_OUT ) \\ $( MAKE ) -C kernel ARCH = arm CROSS_COMPILE = arm-eabi- $( KERNEL_DEFCONFIG ) This way it will work no matter where the actual AOSP output directory is. UPDATE (09/03/2015) As Ryan pointed out , for Mac users, you may need to install GNU readlink , instead of the built-in one.","tags":"android","title":"Build Kernel In Tree with AOSP for Nexus 5 Hammerhead"},{"url":"http://jhshi.me/2014/06/27/fix-data-connection-for-nexus-5-hammerhead-on-android-4-dot-4-4-kitkat/index.html","text":"Recently, I needed to build a working ROM for the Nexus 5 from LG (hammerhead henceforth). There are a variety of tutorials and guides all over the web on the general steps needed to compile AOSP from scratch, which I do not intend to repeat here. Instead, I'll mostly focus on how to make the data connection (3G/LTE) work on Sprint phones. I chose the latest AOSP release as of writing this post, android-4.4.4_r1 as per the official Android build numbers page , and followed the official build instructions from Android . Everything went smoothly, except that after flashing to the device, I found there was no data connection (3G/LTE). Of course Google apps were also missing but that should be easy to fix. After banging my head for a while, I came across this post from Jameson and this thread , which shed some light on what's happening. Apparently, the vendor binaries from Google's driver page do not work properly out of the box. Some were missing, such as OmaDmclient.apk , and others were different from those in the factory image. So based on Jameson's vendor binary repos ( lge , qcom ), I updated them with the binaries from the factory image of Android 4.4.4 (KTU84P). Yet still no luck.
Finally, one of the comments in that post led me to this xda thread talking about APN fixes for Sprint users, which seems to be just what I missed. So I used the apns-conf.xml file from there and voila, LTE is working! One tiny glitch though: on first boot, activating the data connection took far longer than it should, so once you see the LTE icon, it's safe to hit skip. UPDATE (Jun 28, 2014) To be able to sign the added vendor apks properly, I've added a corresponding Android.mk in each proprietary directory. Also, TimeService.apk from qcom should override the one from gapps. UPDATE (Oct 3, 2015) To extract the files from the factory image: Download the factory image, uncompress it. Unzip the image files ( images-hammerhead-xxxxxx.zip ) inside it. Uncompress the system.img file: $ simg2img system.img system.img.raw Mount the system image (assuming you already have a mount point). $ mount -t ext4 -o loop system.img.raw /mnt/system.img/ More instructions can be found here .","tags":"android","title":"Fix Data Connection for Nexus 5 Hammerhead on Android 4.4.4 Kitkat"},{"url":"http://jhshi.me/2014/06/18/performance-tips-about-django-orm/index.html","text":"Django provides a friendly Object Relational Mapping (ORM) framework. In several of my data analysis projects, I used the Django ORM to process millions of logcat records generated by hundreds of Android phones. Here are some of the experiences and tips that helped make the processing just a bit faster. DEBUG Flag First of all, set DEBUG to False in settings.py . With DEBUG as True , Django will keep in memory the DB queries it has run so far, which leads to memory leaks if you have a large batch of importing work. Control Transaction Manually By default, Django will wrap each database operation in a separate transaction, and commit them automatically. Accessing the database frequently will definitely slow you down, especially when all you want to do is just insert (a large amount of) data.
Django's transaction module provides several functions to let you control when to commit the transaction. My favorite one is to use transaction.commit_on_success to wrap the function that imports data for an individual device. An additional benefit is that the data import for each device either finishes completely, or doesn't happen at all. So if something goes wrong during the import, or you have to stop it in the middle for some reason, the next time you rerun the import, you won't get duplicate rows! Bulk Create Rows When you have lots of data that you want to import into the database, instead of calling each object's save function individually, you can store them in a list and use the object manager's bulk_create function. It'll insert the list of objects into the database \"in an efficient manner\". Used together with the transaction.commit_on_success mentioned above, the data import should be fast enough. Iterator Now that all the raw data is imported into the database, the next thing you want to do is probably run a second pass of processing, filtering, or whatever. When the data size is large, it's unlikely that you need to use the rows again and again. Most of the time, you just want to iterate through each log line, gather some statistics, or do some simple computation. So after you construct your (crazy) query set, you want to add an .iterator() call after it, so Django knows you just want to iterate over the data once, and will not bother to cache it. Otherwise, Django will cache the query results, and soon you will find your system freezes, and the kernel does nothing but swapping... Reset Queries And Garbage Collection Every now and then you can also reset Django's query log manually with the reset_queries function, and trigger garbage collection using gc.collect() .
Resources Database access optimization","tags":"Django","title":"Performance Tips about Django ORM"},{"url":"http://jhshi.me/2014/04/25/how-android-wifi-state-machine-works/index.html","text":"Recently, I studied how Android Wi-Fi subsystem works. I was more specifically interested to learn the scan behavior. The source code related to this is mainly in framework/base/wifi/java/android/net/wifi/ within AOSP source tree. The Big Picture Android uses a customized wpa_supplicant to perform AP authentication and association, and also communicate with underlying driver. The WifiNative class is used to send various commands to wpa_supplicant ,and the WifiMonitor class is used to monitor wpa_supplicant status change and notify Android framework. wpa_supplicant communicates with underlying driver using new CFG80211/NL80211 interface. Basics of Hierarchical State Machine Android framework uses a Hierarchical State Machine (HSM) to maintain the different states of Wi-Fi connection. As the name indicates, all states are organized in a tree, and there is one special initial state. The interface of each state is as follows: enter() : called when entering this state. exit() : called when exiting this state. processMessage() : called when message arrives. The most import property of HSM is that when transitioning between states, we first found the common ancestor state that's closest to current state, the we exit from current state and all its ancestor state up to but not include the closest common ancestor, then enter all of the new states below the closet common ancestor down to the new state. Here is a simple example HSM. S4 is the initial state. When we first start the HSM, S0.enter() , S1.enter() and S4.enter() will be called in sequence. Suppose we want to transit from S0 to S7 , since the closet common ancestor is S0 , S4.exit() S1.exit() , S2.enter() , S5.enter() and S7.enter() will be called in sequence. 
More details about the HSM can be found in the comments of frameworks/base/core/java/com/android/internal/util/StateMachine.java . Wifi State Machine Here is a subset of the whole Android Wifi HSM; states about P2P connections are omitted. So in the Initial state's enter() function, we check if the driver is loaded, and transit to Driver Loaded state if so. Then we start wpa_supplicant . When we receive SUP_CONNECTED_EVENT , we switch to Driver Started state. But before that, we need to enter Supplicant Started state first. In the enter() function of Supplicant Started state, we set the supplicant scan interval, which, by default, is 15 seconds, defined in frameworks/base/core/res/res/values/config.xml as config_wifi_supplicant_scan_interval . So the first fact of Android scan behavior is that it'll do a scan every 15 seconds as long as wpa_supplicant is started, no matter what the Wi-Fi condition is. Then we come to Driver Started state; if we're not in scan mode, then we switch to Disconnected Mode state. Scan mode means that Wi-Fi will be kept active, but the only operation that will be supported is initiation of scans, and the subsequent reporting of scan results. No attempts will be made to automatically connect to remembered access points, nor will periodic scans be automatically performed looking for remembered access points. In Disconnected Mode state's enter() function, if the chipset does not support background scan, then we enable framework periodic scan. The default interval is 300 seconds (5 mins), defined in frameworks/base/core/res/res/values/config.xml as config_wifi_framework_scan_interval . So the second behavior of Android scan is that, in disconnected mode, it'll issue a scan every 5 minutes. Then if we receive a NETWORK_CONNECTION_EVENT event from WifiMonitor , we switch to Obtaining IP state, which will initiate the DHCP process if needed. Then we go through Verifying Link and Captive Portal Check states, and finally reach Connected state.
WifiWatchdogStateMachine will continuously monitor the link quality and packet loss events, and will send out POOR_LINK_DETECTED or GOOD_LINK_DETECTED events. Android Scan Interval Here are the statistics of the scan interval distribution collected on 129 Nexus S phones for about 5 months. We can see that there are 4 peaks in the distribution. The peak around 15 seconds is due to the wpa_supplicant scan interval, and the peak around 300 is due to the framework periodic scan. The peak around 60 seconds is not very clear yet, probably due to the scan interval when P2P is connected. The interesting fact is actually the peak within 2 seconds. It seems most of the scan results are clustered together in a small time window (1~2 seconds). This is because when the driver is scanning, it'll report every time it detects one AP. So in one scan, multiple scan result events will be triggered. And every time there is a low-level scan result event, Android will report the complete updated scan result list.","tags":"research","title":"How Android Wifi State Machine Works"},{"url":"http://jhshi.me/2014/04/08/os161-debug-tips/index.html","text":"In doing OS161 assignments, if you don't know how to use GDB and how to use it efficiently, you're screwed, and will die very ugly. It's a very important skill to use GDB to identify what's wrong with your code. That's the first step towards fixing the bug. db: Connect to sys161 with fewer keystrokes The canonical way in GDB to connect to sys161 is using this command: target remote unix:.sockets/gdb You really don't want to type that every time you restart sys161. You may wonder: there's got to be a better way to do this . YES, there is. Create a file named .gdbinit inside your ~/root directory, or wherever you launch GDB. In that file, put this code: def db target remote unix:.sockets/gdb end Then in GDB, a simple db command will connect GDB to sys161. How does it work? Well, we defined a custom command called db , which does the dirty work.
When GDB starts, it'll read the file named .gdbinit in the current working directory if it exists. So GDB will recognize the db command and know what to do when we type db . backtrace: WTF just happened? Have you ever seen the kernel panic out of nowhere with no clue what just happened? One of the purposes of the panic function is to provide a universal endpoint for all kinds of messy errors. So when your kernel does panic, you know where to backtrace the bug. So whenever your kernel panics, you don't panic. Just set a breakpoint at the panic function and do a backtrace when your kernel gets there. You'll find out exactly which line of code triggered the panic. Then you can fix it. until: Jump out of the loop Have you ever tried to jump out of a loop and just want to see the suspicious part after the loop? For example, you use a for loop to initialize the file table, or process table, or whatever table. And you're pretty sure the loop is OK. But when you step into that function, you may need to hopelessly press next N times to pass the loop. Of course there is a better way to do this! You can use the until command of GDB, which, as per the GDB help message, will \"execute until the program reaches a source line greater than the current or a specified location (same args as break command) within the current frame.\" Basically, it'll set a one-time breakpoint at the line you specified, and execute until the CPU reaches that line of code. finish: Get the hell out of here In short, this command will let GDB keep executing until the current function returns. It's useful when you accidentally step into a function which you know works well. Or when at the end of the function is a for loop which you're sure is OK. display: Show me this, period. You may know how to use the print command to print out variable values to make sure everything is as expected. But there are some variables you want to examine every time you hit a breakpoint.
For example, you may want to show the process's pid whenever you hit sys_fork or sys_waitpid . So, instead of typing the print command every time, you can use the display command. Basically, the usage is the same as print , just that every time you hit a breakpoint, GDB will display the variable's value. condition: Only stop here if... So you know how to set a breakpoint, but sometimes you only want to hit that breakpoint when certain things happen. For example, when you debug sys_lseek using /testbin/fileonlytest , you may want to also check your sys_write as well, because it also updates the file handle offset. But if you set a breakpoint at sys_write , you'll hit it every time the user program prints something, i.e., writes to stdout, which is not very interesting, and kind of annoying because you don't really care about it. The solution is to use the condition command. Basically it allows you to set a conditional breakpoint so GDB will only stop at the breakpoint if the condition is true. For example, if I only want to step into sys_write when the fd is 3, I can attach the condition fd == 3 to the breakpoint at sys_write . If you have any other GDB tricks that you think are really awesome, welcome to comment below and I'd be happy to include them here.","tags":"os161","title":"OS161 Debug Tips"},{"url":"http://jhshi.me/2014/04/02/get-package-usage-statistics-in-android/index.html","text":"In developing PhoneLab Conductor, I need to get various statistics about an installed package to determine if an app is actively used by a participant. For example, for interactive apps, I'd like to know how many times the user launches the app, and how long the user actively interacts with the app. For background apps (e.g., data collection), I'd like to know how long the background service has been running. There is this dumpsys tool in Android which will provide various information about the status of the system, including package statistics. Here is the sample output.
root@android:/ # dumpsys usagestats Date: 20140402 com.android.launcher: 2 times , 43748 ms com.android.launcher2.Launcher: 2 starts com.tencent.mm: 1 times , 167750 ms com.tencent.mm.ui.chatting.ChattingUI: 4 starts, 1000-1500ms = 2, 4000-5000ms = 1 com.tencent.mm.ui.tools.ImageGalleryUI: 1 starts, 250-500ms = 1 com.tencent.mm.ui.LauncherUI: 4 starts, 2000-3000ms = 1 com.tencent.mm.ui.friend.FMessageConversationUI: 1 starts, 250-500ms = 1 com.android.settings: 2 times , 93065 ms com.android.settings.Settings: 2 starts com.android.settings.SubSettings: 2 starts, 250-500ms = 1, 500-750ms = 2 com.google.android.gm: 1 times , 11396 ms com.google.android.gm.ConversationListActivityGmail: 1 starts, 500-750ms = 1 At first glance, this is a perfect fit for my purpose. But there are two problems. This command needs to be run in a shell. How can I get this information programmatically using Java code? I definitely don't want to execute this shell command and then parse its output. Only interactive apps' statistics are included. What about background apps which may not have an activity? IUsageStats Service After poking around the Android Settings app's source code, I found there is an internal interface called IUsageStats . It's defined in framework/base/core/java/com/android/internal/app/IUsageStats.aidl inside the AOSP tree. You can find it here package com.android.internal.app ; import android.content.ComponentName ; import com.android.internal.os.PkgUsageStats ; interface IUsageStats { void noteResumeComponent ( in ComponentName componentName ); void notePauseComponent ( in ComponentName componentName ); void noteLaunchTime ( in ComponentName componentName , int millis ); PkgUsageStats getPkgUsageStats ( in ComponentName componentName ); PkgUsageStats [] getAllPkgUsageStats (); } The PkgUsageStats class is defined in framework/base/core/java/com/android/internal/os/PkgUsageStats.java link .
public class PkgUsageStats implements Parcelable { public String packageName ; public int launchCount ; public long usageTime ; public Map < String , Long > componentResumeTimes ; // other stuff... } It contains all the information I need about foreground apps! Now the problem is how to access Android's internal classes and interfaces. There are plenty of ways to do this. Since I have the AOSP source tree at hand, I just copy those two files into my project. For PkgUsageStats , I also need to copy the aidl file ( framework/base/core/java/com/android/internal/os/PkgUsageStats.aidl ). Here is the final directory structure of my src folder. src/ |-- com | `-- android | `-- internal | |-- app | | `-- IUsageStats.aidl | `-- os | |-- PkgUsageStats.aidl | `-- PkgUsageStats.java `-- other stuff Here is the code snippet that gets the IUsageStats service. private static final String SERVICE_MANAGER_CLASS = \"android.os.ServiceManager\" ; try { Class serviceManager = Class . forName ( SERVICE_MANAGER_CLASS ); Method getService = serviceManager . getDeclaredMethod ( \"getService\" , new Class []{ String . class }); mUsageStatsService = IUsageStats . Stub . asInterface (( IBinder ) getService . invoke ( null , \"usagestats\" )); } catch ( Exception e ) { Log . e ( TAG , \"Failed to get service manager class: \" + e ); mUsageStatsService = null ; } Here I use Java reflection to get the class of android.os.ServiceManager , which is also an internal interface. After that, you just get all the package statistics like so: HashMap < String , PkgUsageStats > stats = new HashMap < String , PkgUsageStats >(); if ( mUsageStatsService != null ) { try { PkgUsageStats [] pkgUsageStats = mUsageStatsService . getAllPkgUsageStats (); for ( PkgUsageStats s : pkgUsageStats ) { stats . put ( s . packageName , s ); } } catch ( Exception e ) { Log .
e ( TAG , \"Failed to get package usage stats: \" + e ); } } Background Service Running Time It seems that Settings->Apps->Running Apps is already showing how long a process or service has been running. After inspecting the source code of the Settings app, I found that information is coming from ActivityManager.RunningServiceInfo . There is a field named activeSince , which is the time when the service was first made active. === UPDATE === It seems the reflection needs system permission (I haven't tested this yet). Since we build our own platform, it's not a problem--we just sign our apk with the platform key and declare our app's user as system in AndroidManifest.xml . android:sharedUserId=\"android.uid.system\" But if you don't have the platform key, then this approach probably won't work. The other way I can think of is to run the dumpsys command and parse the output, but it still requires root permission.","tags":"Android","title":"Get Package Usage Statistics in Android"},{"url":"http://jhshi.me/2014/03/21/switch-channel-without-breaking-tcp-connection-in-openwrt/index.html","text":"Recently, I've been working on dynamic channel selection based on channel utilization. One problem I encountered is: how to switch both the AP's and the devices' channel without interrupting existing TCP connections. First Intuitive Solution I have a router ( TP-LINK TL-WDR3500 ) running OpenWrt . Wireless configurations, e.g., SSID, channel, tx power, are managed in OpenWrt's UCI system. More specifically, all Wifi configurations are stored in a file located at /etc/config/wireless .
In my case, the file looks like this: config wifi-device 'radio0' option type 'mac80211' option hwmode '11g' option path 'platform/ar934x_wmac' option htmode 'HT20' list ht_capab 'LDPC' list ht_capab 'SHORT-GI-20' list ht_capab 'SHORT-GI-40' list ht_capab 'TX-STBC' list ht_capab 'RX-STBC1' list ht_capab 'DSSS_CCK-40' option txpower '27' option channel '11' config wifi-iface option device 'radio0' option network 'lan' option mode 'ap' option ssid 'PocketSniffer' option encryption 'psk2' option key 'XXXX' OpenWrt provides a command called wifi that can reload these configurations. So my first solution is to use the uci command to change the configuration and the wifi command to reload it. def set_channel ( channel ) : args = [ 'uci' , 'set' ] if channel <= 11 : args . append ( 'wireless.radio0.channel=' + str ( channel )) else : args . append ( 'wireless.radio1.channel=' + str ( channel )) subprocess . call ( args ) subprocess . call ([ 'uci' , 'commit' ]) subprocess . call ([ 'wifi' ]) This will work in the sense that it can change the AP's channel. But the problem is, the wifi command will actually shut down the interface completely and restart it. So any devices connected to this AP will be de-associated. What's the Problem? From the client's point of view, when the AP switches to another channel, here is what happened: Receive de-authentication frame from AP (oops, this AP is gone) Do active scan on every channel (probe-wait) Figure out the best AP to associate with Send authentication and association requests to the newly selected AP This is much like a typical handover process where a device switches between two geographically co-located APs. Just that in this case, the two APs are actually the same physical AP on a different channel. A. Mishra et al. provide a thorough study of the handover process. In short, the process can take up to a few hundred milliseconds, and any on-going TCP connections will be lost.
This is undesired because the channel switch cost (extra latency and broken TCP connections) may neutralize the benefit of switching channels in the first place. Ideally, after a channel switch, any authentication info at the AP side should remain, so that clients don't have to re-authenticate, and any established TCP connections should also be kept. These requirements make sense because, after all, the channel is just the medium for exchanging data. A channel switch should NOT affect any upper-layer state. The Final Solution After a bit of research, I found that the IEEE 802.11 standard (section 10.9.8 in the 2012 standard) actually already defines a mechanism to let the AP announce the channel switch event and also let clients switch channels accordingly - all happening in the MAC layer. This feature fits our needs quite well. And the good news is that this feature has already been implemented in most recent drivers that adopt the CFG80211 interface, and is exposed to user space tools, such as hostapd or wpa_supplicant. The OpenWrt running on our router uses hostapd as the user space authenticator. And it provides a command line tool called hostapd_cli to interact with the hostapd daemon. There is a command in hostapd_cli called chan_switch that does precisely what we wanted. def set_channel ( channel ) : # do not use the wifi command to switch channel, but still maintain the # channel coherence of the configuration file args = [ 'uci' , 'set' ] if channel <= 11 : args . append ( 'wireless.radio0.channel=' + str ( channel )) else : args . append ( 'wireless.radio1.channel=' + str ( channel )) subprocess . call ( args ) subprocess . call ([ 'uci' , 'commit' ]) # this is the command that actually switches channel with open ( os . devnull , 'wb' ) as f : cmd = 'chan_switch 1 ' + str ( channel2freq ( channel )) + ' \\n ' p = subprocess . Popen ( 'hostapd_cli' , stdin = subprocess . PIPE , stdout = f , stderr = f ) p . stdin . write ( cmd ) time . sleep ( 3 ) p .
kill () Here we still update the configuration file to maintain consistency between it and the hostapd daemon. But instead of using the wifi command to reload the configuration, we use the chan_switch command to change the channel. chan_switch takes a minimum of two arguments. The first is a cs_count , meaning switch the channel after that many beacon frames. The second is the frequency. More usage info can be obtained by typing chan_switch without any arguments in hostapd_cli .","tags":"research","title":"Switch Channel Without Breaking TCP Connection in OpenWrt"},{"url":"http://jhshi.me/2014/03/06/os161-unknown-syscall-1/index.html","text":"When working on OS161 system calls, you'll probably see a bunch of this error, especially if you haven't implemented the _exit syscall and try to run some basic user programs, e.g., p /bin/true . Note, this problem has been fixed in OS/161 version 1.99.07. The code for /bin/true is as follows. int main () { /* Just exit with success. */ exit ( 0 ); } It does nothing but exit with 0. Because at this point you may not have the exit syscall implemented, it'll fail, and you'll see one error message saying \"Unknown syscall 3\", in which 3 is just SYS__exit . Then what happens? Why are there a bunch of \"Unknown syscall -1\" following that? To understand this, you need to know a bit about GCC optimization and also several MIPS instructions , especially jal and jr . MIPS Function Call and Return Here is the MIPS assembly instruction that \"calls\" a function foo . jal foo jal stands for \"Jump And Link\": it will first save $epc+8 into register $ra (return address), and set $epc to whatever address foo is at, to \"jump\" to that function. Now you may wonder why $ra is $epc+8 , since the natural next instruction would be $epc+4 . That's because $epc+4 is in jal 's delay slot , which means that instruction will get executed before the jump takes effect. So the real next instruction after the function call should be $epc+8 .
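To make the $ra arithmetic concrete, here is a toy model in plain Python (purely illustrative, not real MIPS; the addresses are example values in the style of the /bin/true disassembly): jal saves pc+8, skipping over the delay-slot instruction at pc+4, and jr ra jumps back to that saved address.

```python
# Toy model (plain Python, not real MIPS) of how `jal` fills $ra and
# how `jr ra` uses it. Addresses are illustrative example values.
def jal(pc, target, regs):
    # The instruction at pc+4 sits in the delay slot and still executes,
    # so the saved return address must skip over it: pc + 8.
    regs["ra"] = pc + 8
    return target  # the new $epc

def jr_ra(regs):
    # `jr ra` simply sets $epc to whatever $ra holds.
    return regs["ra"]

regs = {}
pc = jal(0x400108, 0x400134, regs)   # "call" a function at 0x400134
assert pc == 0x400134
assert regs["ra"] == 0x400110        # 0x400108 + 8, past the delay slot
assert jr_ra(regs) == 0x400110       # the callee "returns" to the caller
```

Note the return address is two instructions (8 bytes) past the jal, not one, precisely because of the delay slot.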
And when foo is done and about to return, it just does this: jr ra jr stands for \"Jump Register\". It just sets $epc to whatever value is in that register. In this case, since $ra contains the return address, the foo function \"returns\" to the next instruction after the jal in the caller. GCC Optimization As per the comments in $OS161_SRC/user/lib/libc/stdlib/exit.c , GCC is smart enough to know, without being explicitly told, that exit doesn't return. So it actually omits the jr instruction at the end of exit . That is, if exit does return, the CPU will continue to execute whatever instructions follow. What really happened? Here is the assembly code of /bin/true . You can obtain it by doing this in the root directory: $ os161-objdump -d bin/true > true.S 00400100 <main>: 400100: 27bdffe8 addiu sp,sp,-24 400104: afbf0010 sw ra,16(sp) 400108: 0c10004d jal 400134 <exit> 40010c: 00002021 move a0,zero 00400110 <__exit_hack>: 400110: 27bdfff8 addiu sp,sp,-8 400114: 24020001 li v0,1 400118: afa20000 sw v0,0(sp) 40011c: 8fa20000 lw v0,0(sp) 400120: 00000000 nop 400124: 1440fffd bnez v0,40011c <__exit_hack+0xc> 400128: 00000000 nop 40012c: 03e00008 jr ra 400130: 27bd0008 addiu sp,sp,8 00400134 <exit>: 400134: 27bdffe8 addiu sp,sp,-24 400138: afbf0010 sw ra,16(sp) 40013c: 0c100063 jal 40018c <_exit> 400140: 00000000 nop ... 00400150 <__syscall>: 400150: 0000000c syscall 400154: 10e00005 beqz a3,40016c <__syscall+0x1c> 400158: 00000000 nop 40015c: 3c010044 lui at,0x44 400160: ac220430 sw v0,1072(at) 400164: 2403ffff li v1,-1 400168: 2402ffff li v0,-1 40016c: 03e00008 jr ra 400170: 00000000 nop So main calls exit (0x400108), exit calls _exit (0x40013c). Note that at this point, $ra=$epc+8=0x400144 . _exit fails (because we haven't implemented it yet), $v0 is set to -1, and it returns to $ra . The memory between 0x400140 and 0x400150 is filled with 0, which is the nop instruction in MIPS.
So the CPU gets all the way down to the __syscall function at 0x400150, and executes the syscall instruction. At this point, the value of $v0 is -1. That's why we see the first Unknown syscall -1 error message. And after the syscall fails, the CPU will continue execution at 0x400154, and finally do jr ra (0x40016c). Since $ra is still 0x400144, the whole process repeats. That's why you keep seeing the Unknown syscall -1 error. How to fix? The problem is, GCC assumes exit does not return, thus doesn't generate the jr ra instruction for exit . But before we implement the _exit syscall, exit does return. Then we lose control and things get messy. Then how to fix this? Well, the easiest way to fix this is...implement _exit , of course. After all, that's what you're supposed to do in ASST2 anyway. In terms of the problem itself, the latest version of OS/161 (1.99.07) has fixed this. Here is how: void exit ( int code ) { /* * In a more complicated libc, this would call functions registered * with atexit() before calling the syscall to actually exit. */ #ifdef __mips__ /* * Because gcc knows that _exit doesn't return, if we call it * directly it will drop any code that follows it. This means * that if _exit *does* return, as happens before it's * implemented, undefined and usually weird behavior ensues. * * As a hack (this is quite gross) do the call by hand in an * asm block. Then gcc doesn't know what it is, and won't * optimize the following code out, and we can make sure * that exit() at least really does not return. * * This asm block violates gcc's asm rules by destroying a * register it doesn't declare ($4, which is a0) but this * hopefully doesn't matter as the only local it can lose * track of is \"code\" and we don't use it afterwards.
*/ __asm volatile ( \"jal _exit;\" /* call _exit */ \"move $4, %0\" /* put code in a0 (delay slot) */ : /* no outputs */ : \"r\" ( code )); /* code is an input */ /* * Ok, exiting doesn't work; see if we can get our process * killed by making an illegal memory access. Use a magic * number address so the symptoms are recognizable and * unlikely to occur by accident otherwise. */ __asm volatile ( \"li $2, 0xeeeee00f;\" /* load magic addr into v0 */ \"lw $2, 0($2)\" /* fetch from it */ :: ); /* no args */ #else _exit ( code ); #endif /* * We can't return; so if we can't exit, the only other choice * is to loop. */ while ( 1 ) { } } So if _exit returns for any reason, we just access an address we know is invalid, thus triggering an exception, and the kernel just panics.","tags":"os161","title":"OS161: Unknown syscall -1"},{"url":"http://jhshi.me/2014/01/06/tap-notification-to-send-email/index.html","text":"In developing the PhoneLab Conductor app, I need to provide users a way to give us feedback after applying an OTA update. Although this feature was disabled in the release, I thought it's worthwhile to record how to implement that functionality anyway. The Scenario After the phone receives an OTA update and reboots to apply it, the conductor app will pop up a notification, saying something like \"You've updated your platform, if there's any question, please tap to email for help.\". So when the user taps the notification, a selection alert should pop up to let the user select which email client to use. Then we open that email client with the proper recipient, subject, and email body (e.g., some extra debug information). The Overall Flow When we post a notification using Notification.Builder , we can optionally set a PendingIntent describing what action to take when the user taps that notification. This is done via the setContentIntent function. builder . setContentIntent ( reportProblemPendingIntent ); notificationManager . notify ( PLATFORM_UPDATE_NOTIFICATION_ID , builder .
build ()); And that PendingIntent will broadcast a custom intent so our BroadcastReceiver will be called to handle that tap event. private String reportProblemIntentName = this . getClass (). getName () + \".ReportProblem\" ; private IntentFilter reportProblemIntentFilter = new IntentFilter ( reportProblemIntentName ); private PendingIntent reportProblemPendingIntent ; private BroadcastReceiver reportProblemReceiver = new BroadcastReceiver () { @Override public void onReceive ( Context context , Intent intent ) { // to be filled } }; // in initialization function reportProblemPendingIntent = PendingIntent . getBroadcast ( context , 0 , new Intent ( reportProblemIntentName ), PendingIntent . FLAG_UPDATE_CURRENT ); context . registerReceiver ( reportProblemReceiver , reportProblemIntentFilter ); Launch Email App Now when the user taps the notification, the onReceive handler will be called. First, we need to cancel the notification. NotificationManager notificationManager = ( NotificationManager ) context . getSystemService ( Context . NOTIFICATION_SERVICE ); notificationManager . cancel ( PLATFORM_UPDATE_NOTIFICATION_ID ); Then we prepare the intent for launching the email app. Intent emailIntent = new Intent ( Intent . ACTION_SENDTO ); emailIntent . setType ( \"text/plain\" ); String messageBody = \"========================\\n\" + \" Optional debug info \\n\" + \"========================\\n\" + \"Please describe your problems here.\\n\\n\" ; String uriText = \"mailto:\" + Uri . encode ( PHONELAB_HELP_EMAIL ) + \"?subject=\" + Uri . encode ( \"OTA Update Problem\" ) + \"&body=\" + Uri . encode ( messageBody ); emailIntent . setData ( Uri . parse ( uriText )); Note that in order to actually launch the email chooser, we need another intent. Intent actualIntent = Intent . createChooser ( emailIntent , \"Send email to PhoneLab\" ); actualIntent . addFlags ( Intent . FLAG_ACTIVITY_NEW_TASK ); context . 
startActivity ( actualIntent );","tags":"Android","title":"Tap Notification To Send Email"},{"url":"http://jhshi.me/2013/12/17/sum-of-n-largest-numbers-in-google-spreadsheet/index.html","text":"I encountered this problem when trying to get the final grades for a course I TAed this semester. There were 10 homework assignments throughout the semester, and we're supposed to only count the 8 highest grades. So, how to accomplish this in Google Spreadsheet? First Try After poking around Google search results a little bit, I found this solution, which seems to work. =ceiling(sum(filter(E2:N2,E2:N2>=large(E2:N2, 8)))/8,1) Where E2:N2 contains the 10 grades. The large function will return the 8th highest grade of the 10, and then we only sum the grades that are larger than or equal to that grade. This seemed all fine until I accidentally found that some students got more than 100 pts, which is impossible because all our grades are 100 based! The Problem Well, what's wrong with the previous formula? Suppose a student's 10 grades look like this 94 97 92 94 98 100 100 100 100 100 Sort them in descending order 1 2 3 4 5 6 7 8 9 10 100 100 100 100 100 98 97 94 94 92 So in this case the 8th largest number is 94, yet there are two 94s, and we really just need one of them. The Solution After struggling through the Google Spreadsheet function list , I found that we can actually do SQL-like queries within the spreadsheet! This leads to the final solution. =ceiling(sum(query(sort(transpose(E2:N2), 1, FALSE), \"select * limit 8\"))/8,1) Here we first transpose the row data into a column, then sort it in descending order, then we just take the first 8 grades when calculating the average.","tags":"tricks","title":"Sum of N Largest Numbers in Google Spreadsheet"},{"url":"http://jhshi.me/2013/12/15/os161-tool-chain-setup/index.html","text":"This post shows how to install the os161 toolchain, including bmake , sys161 , etc. on your local machine. Why Even Bother? 
Some instructors set up the environment on public machines that students can share; some distribute the whole os161 development environment in a VM appliance , in which the tool chain is already set up for you. In both cases, students can start working on the OS itself immediately, instead of being bogged down by the tool chain setup process and losing confidence even before starting. However, I think it's still beneficial to set up the tool chain on our local machine ourselves: Virtual Machines typically suffer from performance degradation, especially when your machine is not that high-end (4 or 8 cores, 8 or 16 Gig RAM, etc.). And most people have experienced video driver issues after accidentally upgrading the guest VM. The setup process can help us understand at least how the tools interact. The cross-compiling experience could potentially be useful in future projects/assignments. You can gain some confidence if you can set up the tool chain successfully. And confidence is the key to surviving later assignments. The following instructions were tested under Ubuntu 13.10 x86_64 with gcc version 4.8.1; they should, however, also work on other distros. Directory Setup Suppose you want to place the os161 related stuff in ~/projects/courses/os161 , then you would have to set up the directory structure like this. mkdir -p ~/projects/courses/os161 mkdir -p ~/projects/courses/os161/toolbuild mkdir -p ~/projects/courses/os161/tools/bin Eventually the os161 directory will be the top level directory for all our os161 stuff. And toolbuild will contain all the downloaded and extracted packages, and tools will contain all the os161 environments, like the compiler, debugger, simulator, etc. To simplify further steps, we set up a few environment variables. 
export PREFIX = ~/projects/courses/os161/tools export BUILD = ~/projects/courses/os161/toolbuild export PATH = $PATH : $PREFIX /bin Of course you can install the os161 tool chain anywhere you like, just make sure the directory structure is right. Note that: In the whole process of doing this, you don't need to touch any file outside our os161 directory (unless explicitly stated). So if you must use sudo to copy some stuff, then you probably typed something wrong. If you choose to install the tool chain somewhere else, you need to adjust the variables accordingly. The environment variables (e.g., PREFIX , BUILD ) are only valid in the current session , so in case you want to take a break (e.g., play guitar) during the process, make sure you still have those variables. You can check by doing echo $PREFIX and making sure it's ~/projects/courses/os161/tools . If they disappear somehow, just redo the export commands. Download And Extract the Packages You can download all the required packages in this page . As of writing this post, the latest packages are: binutils-2.17+os161-2.0.1.tar.gz gcc-4.1.2+os161-2.0.tar.gz gdb-6.6+os161-2.0.tar.gz bmake-20101215.tar.gz mk-20100612.tar.gz sys161-1.99.06.tar.gz Download the above packages and put them in the toolbuild directory we just created. Extract the packages as follows: cd $BUILD tar xvf binutils-2.17+os161-2.0.1.tar.gz tar xvf gcc-4.1.2+os161-2.0.tar.gz tar xvf gdb-6.6+os161-2.0.tar.gz tar xvf sys161-1.99.06.tar.gz tar xvf bmake-20101215.tar.gz cd bmake tar xvf ../mk-20100612.tar.gz cd .. Note that we have to extract the mk-20100612.tar.gz package inside the bmake directory . Binutils cd binutils-2.17+os161-2.0.1 ./configure --nfp --disable-werror --target = mips-harvard-os161 --prefix = $PREFIX find . -name '*.info' | xargs touch make make install cd .. Note how we set --prefix when configuring. That option tells the Makefile where the generated binary or library files should go. 
Also, we fool the make command by touching all the texinfo files to make make think those files don't need to be rebuilt. Because: They really don't need to be regenerated. We don't want to rebuild them, since it's highly possible that makeinfo will yell out some annoying errors on those doc files. And we don't really care about the docs... Checkpoint After this step, you should have some mips-harvard-os161-* binary files in the tools/bin directory. GCC cd gcc-4.1.2+os161-2.0 ./configure --nfp --disable-shared --disable-threads --disable-libmudflap \\ --disable-libssp --target = mips-harvard-os161 --prefix = $PREFIX make -j 8 make install cd .. Note that: The backslash in the configure command is just to tell our shell that we haven't finished typing, so do not execute the command just yet. If you type the whole command in one line, you don't need the backslash. make -j 8 means using 8 threads when compiling. Usually this will speed up the compilation process quite a bit. Checkpoint After this step, you should see mips-harvard-os161-gcc in the tools/bin directory. GDB cd gdb-6.6+os161-2.0 ./configure --target = mips-harvard-os161 --disable-werror --prefix = $PREFIX find . -name '*.info' | xargs touch make make install cd .. Note that: We need --disable-werror when configuring, because newer versions of gcc report warnings that older versions do not. Same as binutils , we avoid rebuilding doc files here. If you see this error when doing configure: configure : error : no termcap library found You probably need to install the libncurses5-dev package. sudo apt-get install libncurses5-dev Checkpoint After this step, you should see mips-harvard-os161-gdb in the tools/bin directory. SYS161 Sys161 is the simulator that our os161 will be running in. cd sys161-1.99.06 ./configure --prefix = $PREFIX mipseb make make install cd .. Checkpoint After this step, you should see sys161 , hub161 , stat161 and trace161 symlinks in the tools/bin directory. 
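The per-step checkpoints so far can be folded into one quick sanity check. Here is a rough Python sketch (the expected names follow the checkpoints above; the function and list names are my own, and you could extend the list with the other mips-harvard-os161-* binaries for a stricter check):

```python
import os

# Tool names the checkpoints above expect to find in $PREFIX/bin
# after the binutils, GCC, GDB, and sys161 steps.
EXPECTED_TOOLS = [
    "mips-harvard-os161-gcc",
    "mips-harvard-os161-gdb",
    "sys161",
    "hub161",
    "stat161",
    "trace161",
]

def missing_tools(bin_dir, expected=EXPECTED_TOOLS):
    """Return the expected tool names that are not present in bin_dir."""
    if not os.path.isdir(bin_dir):
        return list(expected)
    present = set(os.listdir(bin_dir))  # symlinks show up here too
    return [name for name in expected if name not in present]
```

Running missing_tools on your tools/bin after each step should show the list shrinking; an empty list means every checkpoint so far has passed.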
Bmake cd bmake ./boot-strap --prefix = $PREFIX At the end of the boot-strap command output, you should see instructions on how to install bmake properly. In our case, they look like this: mkdir -p /home/jhshi/projects/courses/os161/tools/bin cp /home/jhshi/projects/courses/os161/toolbuild/bmake/Linux/bmake /home/jhshi/projects/courses/os161/tools/bin/bmake-20101215 rm -f /home/jhshi/projects/courses/os161/tools/bin/bmake ln -s bmake-20101215 /home/jhshi/projects/courses/os161/tools/bin/bmake mkdir -p /home/jhshi/projects/courses/os161/tools/share/man/cat1 cp /home/jhshi/projects/courses/os161/toolbuild/bmake/bmake.cat1 /home/jhshi/projects/courses/os161/tools/share/man/cat1/bmake.1 sh /home/jhshi/projects/courses/os161/toolbuild/bmake/mk/install-mk /home/jhshi/projects/courses/os161/tools/share/mk Just do the commands one by one in the order given. Checkpoint After this step, you should see the bmake symlink in the tools/bin directory, and a bunch of *.mk files in the tools/share/mk directory. Create Symbolic Links Now if you take a look at $PREFIX/bin , you will see a list of executables named like mips-harvard-os161-* ; it's convenient to give them shorter names so that we can save a few keystrokes later. cd $PREFIX /bin sh -c 'for i in mips-*; do ln -s $i os161-`echo $i | cut -d- -f4-`; done' Note that the symbol around echo $i | cut -d- -f4- is the backtick ( ` ), the key under Esc (the same key as the tilde ( ~ )). Checkpoint After this step, you should see a bunch of os161-* symlinks in the tools/bin directory. PATH Setup Now we've set up all the required tools to build and run os161. In the first step, we changed our PATH environment variable to include the tools/bin directory. Now is the time to make it permanent so that we won't need to type export PATH=$PATH:~/projects/courses/os161/tools/bin every time we open a terminal. Add this line to your .bashrc . export PATH = $PATH :~/projects/courses/os161/tools/bin Checkpoint Close the current terminal and open a new one. 
Type this commands, and check if the output matches. which sys161 # should be something like /home/jhshi/projects/courses/os161/tools/bin which bmake # should be something like /home/jhshi/projects/courses/os161/tools/bin Configure OS161 Now let's get to real business. Obtain a copy of the os161 source tree according to your course's instruction. In this case, we'll use the one from ops-class.org . Suppose you've registered an account on ops-class.org and uploaded your public key. Then you can clone the source tree and configure as follows. cd ~/projects/courses/os161 mkdir root git clone ssh://src@src.ops-class.org/src/os161 src If you encounter errors like this. cloning into 'src' ... Permission denied ( publickey ) . fatal: Could not read from remote repository. Then you probably didn't set up your key right. Make sure you put the private key (normally id_rsa ) inside ~/.ssh/ , and copy the content of id_rsa.pub to ops-class.org . Now we have the source tree, let's move on and configure it. cd src ./configure --ostree = $HOME /projects/courses/os161/root bmake bmake install cd .. cp tools/share/examples/sys161/sys161.conf.sample root/sys161.conf Note that: We create an root directory under os161 , this will be where the compiled user space binaries, and also the compiled kernel image will go. When configure the os, we specify the --ostree argument, so that the binaries will be copied to the root directory we just created. The default location is ~/root , which is probably not what you want. We must use $HOME/projects/courses/os161/root , instead of ~/projects/courses/os161/root . Otherwise, bmake will complain. We copy the sys161 configuration example to the root directory. This configuration file is needed by sys161 - the simulator. Checkpoint Go to ~/projects/courses/os161/root , you should see some directories there, e.g., bin , hostbin , lib , man , etc. 
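One of the notes above says to pass $HOME/projects/courses/os161/root rather than ~/projects/courses/os161/root to --ostree , because ~ is expanded by the shell, not by the tools that later read the stored path. A tiny, hypothetical Python sketch of that distinction (the function name is mine):

```python
import os.path

def resolve_ostree(path):
    """Expand a user-supplied path to the absolute form that build
    tools expect: '~' means nothing to them, so expand it (and any
    relative components) explicitly, the way the shell would."""
    return os.path.abspath(os.path.expanduser(path))
```

For example, resolve_ostree("~/projects/courses/os161/root") yields the same absolute string that the shell produces for $HOME/projects/courses/os161/root, which is the form bmake is happy with.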
Compile and Run the Kernel cd ~/projects/courses/os161/src/kern/conf ./config ASST0 cd ../compile/ASST0 bmake depend bmake && bmake install Now let's fire up the kernel. cd ~/projects/courses/os161/root sys161 kernel Checkpoint You should see output like this: sys161: System/161 release 1.99.06, compiled Dec 15 2013 17:42:02 OS/161 base system version 1.99.05 Copyright ( c ) 2000, 2001, 2002, 2003, 2004, 2005, 2008, 2009 President and Fellows of Harvard College. All rights reserved. Put-your-group-name-here ' s system version 0 ( ASST0 #7) 320k physical memory available Device probe... lamebus0 ( system main bus ) emu0 at lamebus0 ltrace0 at lamebus0 ltimer0 at lamebus0 beep0 at ltimer0 rtclock0 at ltimer0 lrandom0 at lamebus0 random0 at lrandom0 lhd0 at lamebus0 lhd1 at lamebus0 lser0 at lamebus0 con0 at lser0 cpu0: MIPS r3000 OS/161 kernel [ ? for menu ] : Resources You can find more instructions on tool chain setup and os161 configuration in these pages. Installing OS/161 On Your Own Machine OS/161 Toolchain Setup Building System/161 and the OS/161 Toolchain ASST0: Introduction to OS/161","tags":"os161","title":"OS161 Tool Chain Setup"},{"url":"http://jhshi.me/2013/12/15/simulate-random-mac-protocol-in-ns2-part-iv/index.html","text":"Now that we have designed the simulator , added a new MAC protocol to NS2 , and implemented the Random Resend MAC protocol , the final part is to analyze the trace file to measure the performance of our new protocol. Format of the Trace Entry One line in the trace file may look like this. s 0.010830867 _70_ MAC --- 27 cbr 148 [0 46000000 8 0] ------- [70:0 0:0 32 0] [0] 0 0 | | | | | | | | | | | | | +----- Packet Size | | | | | +--------- Traffic Type | | | | +------------ Packet UID | | | +--------------------- Layer | | +------------------------- Node ID | +---------------------------------- Time +----------------------------------------- Event Type You can find more details about the trace format here . 
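To make the diagram above concrete, here is a small Python sketch that pulls those labeled fields out of one trace line. It splits on runs of whitespace and only names the fields this project cares about, so it is not a general NS2 trace parser; the field indices may shift if your traces contain consecutive spaces and you split on single spaces instead.

```python
def parse_trace_line(line):
    """Extract the fields labeled in the diagram above from one
    wireless trace line."""
    fields = line.split()  # split on any whitespace run
    return {
        "action": fields[0],    # event type: s / r / D
        "time": float(fields[1]),
        "node": fields[2],      # e.g. _70_
        "layer": fields[3],     # e.g. MAC
        "pid": int(fields[5]),  # packet UID
        "traffic": fields[6],   # e.g. cbr
        "size": int(fields[7]), # packet size in bytes
    }
```

Parsing the sample line above should yield action 's', node '_70_', packet UID 27, traffic type 'cbr', and size 148.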
Trace Filtering In this project, we're only interested in the traces that: Are from the MAC layer, Have the CBR traffic type, and Have an Event Type of \"Send\" (s), \"Receive\" (r), or \"Drop\" (D) So we first filter those not-so-interesting traces out. def filter ( trace_file ) : with open ( trace_file , 'r' ) as f : raw_lines = f . read () . split ( ' \\n ' ) print ( \" %s raw traces found.\" % ( len ( raw_lines ))) traces = [] for line in raw_lines : fields = line . split ( ' ' ) if fields [ 0 ] not in [ 's' , 'r' , 'D' ] : continue if not ( fields [ 3 ] == 'MAC' ) : continue if not ( fields [ 7 ] == 'cbr' ) : continue traces . append ({ 'action' : fields [ 0 ], 'node' : fields [ 2 ], 'pid' : int ( fields [ 6 ])}) print ( \" %s filtered traces found.\" % ( len ( traces ))) return traces Delivery Probability To calculate the delivery probability, we need to know: How many unique packets are sent out by all source nodes? How many unique packets are received by the sink node? These two metrics can be easily obtained as follows: nodes = set ( t [ 'node' ] for t in traces ) print ( \" %s nodes found.\" % ( len ( nodes ))) sent = len ( set ( t [ 'pid' ] for t in traces )) recv = len ( set ( t [ 'pid' ] for t in traces if t [ 'node' ] == SINK_NODE and t [ 'action' ] == 'r' )) print ( \"sent: %d , recv: %d , P: %.2f%% \" % ( sent , recv , float ( recv ) / sent * 100 )) Remember that we use LossMonitor as the sink? Now is the time to cross-reference the results here with the ones from the stats file. The total number of received packets should match. The final delivery probability w.r.t. the repeat count X looks like this in my case (packet size is 16 bytes). Note that this is not the ideal probability distribution. Please refer to this paper for theoretical analysis and also simulation results. 
QoMOR: A QoS-aware MAC protocol using Optimal Retransmission for Wireless Intra-Vehicular Sensor Networks","tags":"network","title":"Simulate Random MAC Protocol in NS2 (Part IV)"},{"url":"http://jhshi.me/2013/12/15/simulate-random-mac-protocol-in-ns2-part-iii/index.html","text":"Now we have the simulation script , and have also added our protocol to the NS2 simulator , which is still a placeholder. Now we're going to actually implement our own random MAC protocol. Protocol Description According to the project specification, when sending out a packet, our protocol is supposed to send out X copies of the packet at random times before sending out the next packet. As long as the receiver receives at least one of the X duplicates, we say this packet was successfully delivered. Protocol Parameters From the protocol description, it's obvious that we need to know: How many copies to send for one packet? I.e., the X The interval at which packets arrive from the upper layer, so that we can schedule resending before the upper layer passes down the next packet. So we add two class variables in mac/rmac.h int repeatTx_ ; double interval_ ; And in the constructor function of the RMAC class, we need to bind the variables through TCL so that we can pass values in the TCL script. bind ( \"repeatTx_\" , & repeatTx_ ); bind ( \"interval_\" , & interval_ ); TCL Object Binding We also need to let the TCL runtime know about our protocol. That is, when we write this in the TCL script. set val ( mac ) Mac / RMAC The TCL runtime has to know the corresponding class of Mac/RMAC . Since we copied code from mac-simple.cc and also made the changes, this part has been done, but let's just review the code snippet that does the binding. static class RMACClass : public TclClass { public : RMACClass () : TclClass ( \"Mac/RMAC\" ) {} TclObject * create ( int , const char * const * ) { return ( new RMAC ()); } } class_rmac ; Here the Mac/RMAC string will be our protocol name. 
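Before implementing the resend logic, it is worth sanity-checking what repeatTx_ should buy us. If each copy were lost independently with probability p, sending X copies would deliver the packet with probability 1 - p^X. A rough Monte Carlo sketch in Python (the function name and the independence assumption are mine; real simulations add collisions between the extra copies, which is part of why measured delivery curves deviate from this ideal):

```python
import random

def delivery_probability(p_loss, repeat_tx, trials=100_000, seed=0):
    """Monte Carlo estimate of the chance that at least one of
    repeat_tx independently-lost copies reaches the receiver."""
    rng = random.Random(seed)  # fixed seed for reproducible runs
    delivered = sum(
        any(rng.random() >= p_loss for _ in range(repeat_tx))
        for _ in range(trials)
    )
    return delivered / trials
```

With p = 0.5, for example, going from 1 copy to 3 copies should move delivery from about 50% to about 87.5% under this idealized model.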
Interaction with Adjacent Layer The most important function in any NS2 MAC protocol is the recv function. It's the interface to the upper (Network Layer) and also lower (Physical Layer) layers. The recv of our MAC protocol will look like this. void RMAC :: recv ( Packet * p , Handler * h ) { struct hdr_cmn * hdr = HDR_CMN ( p ); /* let RMAC::send handle the outgoing packets */ if ( hdr -> direction () == hdr_cmn :: DOWN ) { sendDown ( p , h ); } else { sendUp ( p , h ); } } Here we first get the header of the packet, and check its direction. hdr_cmn::DOWN means this packet is from the upper layer, and we need to send it out. hdr_cmn::UP means this packet is from the lower layer (a received packet), and we need to deliver it to the upper layer. Repeat Sending The key part of our MAC protocol is to repeatedly send multiple copies when sending out a packet. So we need to mainly modify the sendDown function. double max_delay = 0 ; // generate repeatTx_ number of random delays double * delays = new double [ repeatTx_ ]; for ( int i = 0 ; i < repeatTx_ ; i ++ ) { delays [ i ] = ( rand () % 100 ) / 100.0 * interval_ ; if ( delays [ i ] > max_delay ) { max_delay = delays [ i ]; } } // use dummy tx handler for first repeatTx_-1 packets for ( int i = 0 ; i < repeatTx_ ; i ++ ) { if ( delays [ i ] != max_delay ) { Scheduler :: instance (). schedule ( & resendHandler_ , ( Event * ) p -> copy (), delays [ i ]); } } delete [] delays ; waitTimer -> restart ( max_delay ); if ( rx_state_ == MAC_IDLE ) { // we're idle, so start sending now sendTimer -> restart ( max_delay + ch -> txtime ()); } else { // we're currently receiving, so schedule it after // we finish receiving sendTimer -> restart ( max_delay + ch -> txtime () + HDR_CMN ( pktRx_ ) -> txtime ()); } We first generate repeatTx_ number of delays before the next interval. Except for the max_delay , which will be the last copy to send, we use the Scheduler to resend the duplicated packets, and for the last packet, we just use the timer scheme of SimpleMac . 
Here is what the resendHandler_ looks like. void RMACResendHandler :: handle ( Event * p ) { mac_ -> resend (( Packet * ) p ); } void RMAC :: resend ( Packet * p ) { downtarget_ -> recv ( p , NULL ); } You can find the complete code for rmac.cc and rmac.h here .","tags":"network","title":"Simulate Random MAC Protocol in NS2 (Part III)"},{"url":"http://jhshi.me/2013/12/15/simulate-random-mac-protocol-in-ns2-part-ii/index.html","text":"In the previous post , we wrote an NS2 simulation program that fits the project specification, except that we're using the standard 802.11 MAC protocol. In this post, we'll discuss how to add our own MAC protocol to NS2. Compile NS2 from Source To add a new protocol to NS2, we actually need to download the whole NS2 source tree and add some extra CPP files there, which is embarrassingly inconvenient. But for now, we have to live with it. Anyways, download the NS2 all-in-one package from here , put the tarball somewhere in your home, say ~/projects/ , then extract it. cd ~/projects tar xvf ns2-allinone-2.35.tar.gz cd ns2-allinone-2.35 ./install The install script will only generate the binaries in the current directory, and will NOT actually copy them anywhere. After the compilation is done, you'll find the ns executable in the ns-2.35 subdirectory. Suppose you put your project files (e.g., the TCL file we wrote) in ~/projects/network/ns2 , then it's convenient to have a symbolic link to the ns binary. cd ~/projects/network/ns2 ln -svf ~/projects/ns2-allinone-2.35/ns-2.35/ns myns Then you'll have a symbolic link called myns , which points to the actual executable. Then you can run your simulation this way. myns random_mac.tcl Here random_mac.tcl is the TCL file we wrote in the last post. Add a New Mac Protocol To add a new MAC protocol, say RMAC , we need to do the following. Suppose you're in the ns2-allinone-2.35/ns-2.35 directory. Create rmac.cc and rmac.h files in the mac subdirectory, for now, just leave them empty. 
Edit Makefile , find the line contains mac/smac.o (around line 249), add one line like this ..... mac/mac-802_3.o mac/mac-tdma.o mac/smac.o \\ mac/rmac.o \\ ..... So now, when you do make inside the ns-2.35 directory, our source file rmac.cc and rmac.h will be compiled. Of course, at this point, there is no content at those two files, which we'll add later. Adapt the SimpleMac Protocol The NS2 source contains a simple MAC protocol called SimpleMac , which is a good start point for us to adapt. Just copy all the contents in mac/mac-simple.h to mac/rmac.h , and mac/mac-simple.cc to mac/rmac.cc . Then change Mac/Simple to Mac/RMAC line 60 of the rmac.cc file. You should be able to compile using the make command in ns-2.35 directory. If everything is OK, go back to the project directory ~/projects/network/ns2 , change the MAC protocol to Mac/RMAC (previously Mac/802.11 ), you should be able to run the simulation using myns , which points to the ns binary we just compiled.","tags":"network","title":"Simulate Random MAC Protocol in NS2 (Part II)"},{"url":"http://jhshi.me/2013/12/13/simulate-random-mac-protocol-in-ns2-part-i/index.html","text":"In this network project, we would need to: Write an simulator using TCL Add an new MAC protocol to NS2 Analyze the simulation results Let's tackle them one by one. In this post, we'll mainly focus on the simulator part. Get Familiar with TCL TCL is actually a quite simple language. It's designed for fast scripting and glue things together. You can find many tutorials online. I found this one especially clean, and straightforward. Simulator Parameters First, let's define some parameters that we'll use later. 
# ====================================================================== # Project parameters # ====================================================================== set val ( node_num ) 101 set val ( duration ) 10 set val ( packetsize ) 16 set val ( repeatTx ) 10 set val ( interval ) 0.02 set val ( dimx ) 50 set val ( dimy ) 50 set val ( nam_file ) \"jinghaos_pa3.nam\" set val ( trace_file ) \"jinghaos_pa3.tr\" set val ( stats_file ) \"jinghaos_pa3.stats\" set val ( node_size ) 5 # ====================================================================== # Node options # ====================================================================== set val ( chan ) Channel / WirelessChannel ; # channel type set val ( prop ) Propagation / TwoRayGround ; # radio-propagation model set val ( netif ) Phy / WirelessPhy ; # network interface type set val ( mac ) Mac / RMAC ; # MAC type #set val(mac) Mac/802_11 ;# MAC type set val ( ifq ) Queue / DropTail / PriQueue ; # interface queue type set val ( ll ) LL ; # link layer type set val ( ant ) Antenna / OmniAntenna ; # antenna model set val ( ifqlen ) 50 ; # max packet in ifq set val ( nn ) $val ( node_num ) ; # number of mobilenodes set val ( rp ) DSDV ; # routing protocol The first part is parameters from the project specification. Here we have 101 nodes (100 source node plus 1 sink node), simulation duration, packet rate, terrain size, etc. The second part is for node configuration. Here we use WirelessChannel with DSDV routing protocol. Note that for MAC protocol, we use Mac/RMAC , which stands for the random MAC protocol we'll add to NS2. Of course, at this point, we don't have our RMAC protocol yet, so you can substitute it with Mac/802_11 for the moment. Simulator Configuration We can obtain an instance of the simulator, and configure it this way. 
# ====================================================================== # Global variables # ====================================================================== set ns [ new Simulator ] set tracefd [ open $val ( trace_file ) w ] set nam [ open $val ( nam_file ) w ] set stats [ open $val ( stats_file ) w ] $ns namtrace-all-wireless $nam $val ( dimx ) $val ( dimy ) $ns trace - all $tracefd set topo [ new Topography ] $topo load_flatgrid $val ( dimx ) $val ( dimy ) Here we set up various global variables, including trace and stats file fd, and also the topology. The we configure the node. # # Create God # create-god $val ( nn ) #Mac/RMAC set repeatTx_ $val(repeatTx) #Mac/RMAC set interval_ $val(interval) $ns node-config \\ -adhocRouting $val ( rp ) \\ -llType $val ( ll ) \\ -macType $val ( mac ) \\ -ifqType $val ( ifq ) \\ -ifqLen $val ( ifqlen ) \\ -antType $val ( ant ) \\ -propType $val ( prop ) \\ -phyType $val ( netif ) \\ -channelType $val ( chan ) \\ -topoInstance $topo \\ -agentTrace ON \\ -routerTrace ON \\ -macTrace ON \\ -movementTrace OFF Here we first create an General Operations Director(GOD) object to track the nodes' position in the topology grid. Then we configure the nodes using the parameters we set up earlier. Note that, again at this point we don't have a RMAC protocol, so we can just comment out the two lines that configure RMAC for now. The Only Sink Node Next, we're going to create the sink node. # # The only sink node # set sink_node [ $ns node ] $sink_node random-motion 0 $sink_node set X_ [expr $val ( dimx ) / 2 ] $sink_node set Y_ [expr $val ( dimy ) / 2 ] $sink_node set Z_ 0 $ns initial_node_pos $sink_node $val ( node_size ) set sink [ new Agent / LossMonitor ] $ns attach-agent $sink_node $sink Here we place the sink node at the center of the terrain, and attach an LossMonitor to it, so that we can get the packet statistics. 
Although the project specification requires us to get the packet statistics from the trace file, we can use the results from LossMonitor to verify the analysis results. The Source Nodes We need to create 100 source nodes. They should be scattered across the terrain randomly, and they should also start transmission at random times, which has two benefits: - In practice, they're highly unlikely to synchronize perfectly, so this models the real world better. - By starting randomly, we minimize the chance of collisions. So we'll have two random number generators, one for the position, and one for the starting time. # # Set up random number generator, to scatter the source nodes # set rng [ new RNG ] $rng seed 0 set xrand [ new RandomVariable / Uniform ] $xrand use-rng $rng $xrand set min_ [expr - $val ( dimx ) / 2 ] $xrand set max_ [expr $val ( dimx ) / 2 ] set yrand [ new RandomVariable / Uniform ] $yrand use-rng $rng $yrand set min_ [expr - $val ( dimy ) / 2 ] $yrand set max_ [expr $val ( dimy ) / 2 ] set trand [ new RandomVariable / Uniform ] $trand use-rng $rng $trand set min_ 0 $trand set max_ $val ( interval ) Also note that we set the seed of the Random Number Generator (RNG) to a constant value 0 , so that each simulation produces the same results, which makes debugging and analysis easier. Then we create all the source nodes in a for loop. 
# # Create all the source nodes # for {set i 0 } { $i < $val ( nn ) -1 } { incr i } { set src_node ( $i ) [ $ns node ] $src_node ( $i ) random-motion 0 set x [expr $val ( dimx ) / 2 + [ $xrand value ]] set y [expr $val ( dimx ) / 2 + [ $xrand value ]] $src_node ( $i ) set X_ $x $src_node ( $i ) set Y_ $y $src_node ( $i ) set Z_ 0 $ns initial_node_pos $src_node ( $i ) $val ( node_size ) set udp ( $i ) [ new Agent / UDP ] $udp ( $i ) set class_ $i $ns attach-agent $src_node ( $i ) $udp ( $i ) $ns connect $udp ( $i ) $sink set cbr ( $i ) [ new Application / Traffic / CBR ] $cbr ( $i ) set packet_size_ $val ( packetsize ) $cbr ( $i ) set interval_ $val ( interval ) $cbr ( $i ) attach-agent $udp ( $i ) set start [ $trand value ] $ns at $start \"$cbr($i) start\" $ns at $val ( duration ) \"$cbr($i) stop\" } Note that we use UDP here instead of TCP, since we don't need any reliable transfer or congestion control from up layer. Also, we attach an Constant Bit Generator (CBR) as the application. Simulator Control We first define the actions to take when the simulator stops. proc stop {} { global ns tracefd nam stats val sink set bytes [ $sink set bytes_ ] set losts [ $sink set nlost_ ] set pkts [ $sink set npkts_ ] puts $stats \"bytes losts pkts\" puts $stats \"$bytes $losts $pkts\" $ns flush - trace close $nam close $tracefd close $stats } Here we first get the packet statistics from LossMonitor , and write them to the stats file, then we flush ns trace and close all the files. Finally, we start the simulator. puts \"Starting Simulation...\" $ns run","tags":"network","title":"Simulate Random MAC Protocol in NS2 (Part I)"},{"url":"http://jhshi.me/2013/12/13/how-to-apply-downloaded-ota-package/index.html","text":"Suppose you've downloaded the OTA package using Android's DownloadManager , this post discusses how to verify it, and how to apply it at client's side. 
Copy the Package to Internal Storage By default, DownloadManager will save the downloaded file in external storage, say, /sdcard . To make sure that this package is still accessible after the phone reboots into recovery, we need to copy the package into internal storage. In this case, we will use the /cache partition. File packageFile = new File ( Environment . getDownloadCacheDirectory () + \"/update.zip\" ); if ( packageFile . exists ()) { packageFile . delete (); } FileChannel source = null ; FileChannel dest = null ; try { source = ( new FileInputStream ( downloadedFile )). getChannel (); dest = ( new FileOutputStream ( packageFile )). getChannel (); long count = 0 ; long size = source . size (); do { count += dest . transferFrom ( source , count , size - count ); } while ( count < size ); } catch ( Exception e ) { Log . e ( TAG , \"Failed to copy update file into internal storage: \" + e ); return false ; } finally { try { source . close (); dest . close (); } catch ( Exception e ) { Log . e ( TAG , \"Failed to close file channels: \" + e ); } } Here we use FileChannel from java.nio instead of the native FileOutputStream , for some performance boost. You can find more discussions about java.nio vs. java.io in this stackoverflow thread . UPDATE 2015-11-03 Another way to copy the package is to use IOUtils.copy function from Apache Common library . Put the downloaded commons-io-x.y.jar in your project's lib directory, then: import org.apache.commons.io.IOUtils ; IOUtils . copy ( new FileInputStream ( packageFile ), new FileOutputStream ( new File ( \"/cache/update.zip\" ))); Verify the Signature For security concern, we need to verify that the downloaded OTA package was signed properly with the platform key. You can refer to this post on how to sign the OTA package. We can use the verifyPackage call provided by RecoverySystem class . try { File packageFile = new File ( new URI ( otaPackageUriString )); RecoverySystem . 
verifyPackage ( packageFile , null , null ); // Log.v(TAG, \"Successfully verified ota package.\"); return true ; } catch ( Exception e ) { Log . e ( TAG , \"Corrupted package: \" + e ); return false ; } This will verify the package against the platform key stored in /system/etc/security/otacerts.zip . You can also provide your own certs file, of course. But in this case, the default platform key will do. Reboot into Recovery and Apply the Package OK, now we're pretty confident that the downloaded package is intact. Let's reboot the phone into recovery and apply it. This is done by the installPackage call. try { File packageFile = new File ( new URI ( otaPackageUriString )); RecoverySystem . installPackage ( context , packageFile ); } catch ( Exception e ) { Log . e ( TAG , \"Error while installing OTA package: \" + e ); Log . e ( TAG , \"Will retry download\" ); startDownload (); } If everything is OK, the installPackage call won't return, and the phone will be rebooted into recovery.","tags":"Android","title":"How to Apply Downloaded OTA Package"},{"url":"http://jhshi.me/2013/12/02/remove-the-figure-prefix-of-caption-in-beamer/index.html","text":"Sometimes it's annoying to have a \"Figure\" prefix when you add a caption to a figure in beamer. Here is how to eliminate it. \\usepackage { caption } \\captionsetup [figure] { labelformat=empty } Before After","tags":"latex","title":"Remove the Figure Prefix of Caption in Beamer"},{"url":"http://jhshi.me/2013/12/02/place-logo-properly-in-beamer/index.html","text":"It's stylish to place a low-profile yet charming logo in some corner of your slides. Here we use the pgf package to accomplish this. \\usepackage { pgf } \\logo { \\pgfputat { \\pgfxy (9.45,1.5) }{ \\pgfbox [center,base] { \\includegraphics [width=1.7cm] { logo.png }}}} You probably need to tweak the coordinates a little bit to fit the logo to your slides. Also, to hide the logo and the page number on the title page, you'll need to do this.
\begin { document } { % no page #, no logo on title page \setbeamertemplate { footline }{} \setbeamertemplate { logo }{} \begin { frame } \titlepage \end { frame } } % other frames \end { document } Here is how they look.","tags":"latex","title":"Place Logo Properly in Beamer"},{"url":"http://jhshi.me/2013/12/02/remove-the-navigation-bar-of-beamer/index.html","text":"Honestly, I never clicked the navigation bar in beamer slides, and I really wondered whether anyone has ever used it. I suspect that most people keep it just to show the pride of using beamer... Anyways, the navigation bar can be annoying sometimes, especially when you have some long footnotes that overlap it. Put this line in the preamble to remove the navigation bar. \beamertemplatenavigationsymbolsempty","tags":"latex","title":"Remove the Navigation Bar of Beamer"},{"url":"http://jhshi.me/2013/12/02/how-to-use-downloadmanager/index.html","text":"DownloadManager is a service provided by Android that can conduct long-running HTTP downloads, typically of large files. So we do not need to worry about connection loss, system reboots, etc. Listen for Download Complete Event Before we start downloading, make sure we are already listening for the broadcast from DownloadManager , so that we won't miss anything. private String downloadCompleteIntentName = DownloadManager . ACTION_DOWNLOAD_COMPLETE ; private IntentFilter downloadCompleteIntentFilter = new IntentFilter ( downloadCompleteIntentName ); private BroadcastReceiver downloadCompleteReceiver = new BroadcastReceiver () { @Override public void onReceive ( Context context , Intent intent ) { // TO BE FILLED } } // when initializing context . registerReceiver ( downloadCompleteReceiver , downloadCompleteIntentFilter ); Request Download We can get an instance of DownloadManager using this call. DownloadManager downloadManager = ( DownloadManager ) context . getSystemService ( Context .
DOWNLOAD_SERVICE ); DownloadManager has a subclass called Request , which we will use to request a download. Here is the code snippet that initiates a download. String url = \"http://example.com/large.zip\" ; DownloadManager . Request request = new DownloadManager . Request ( Uri . parse ( url )); // only download via WIFI request . setAllowedNetworkTypes ( DownloadManager . Request . NETWORK_WIFI ); request . setTitle ( \"Example\" ); request . setDescription ( \"Downloading a very large zip\" ); // we just want to download silently request . setVisibleInDownloadsUi ( false ); request . setNotificationVisibility ( DownloadManager . Request . VISIBILITY_HIDDEN ); request . setDestinationInExternalFilesDir ( context , null , \"large.zip\" ); // enqueue this request DownloadManager downloadManager = ( DownloadManager ) context . getSystemService ( Context . DOWNLOAD_SERVICE ); downloadID = downloadManager . enqueue ( request ); Please refer to the doc for more configuration options of the request object. So now we have a downloadID , which we'll use to query the state of the download. Download Complete Handler Now that the download has started, what do we need to do in the downloadCompleteReceiver above? First, we need to check whether it's for our download, since it's a broadcast event. long id = intent . getLongExtra ( DownloadManager . EXTRA_DOWNLOAD_ID , 0L ); if ( id != downloadID ) { Log . v ( TAG , \"Ignoring unrelated download \" + id ); return ; } Then we need to query the state of the download. This is done via the Query subclass of DownloadManager . DownloadManager downloadManager = ( DownloadManager ) context . getSystemService ( Context . DOWNLOAD_SERVICE ); DownloadManager . Query query = new DownloadManager . Query (); query . setFilterById ( id ); Cursor cursor = downloadManager . query ( query ); // it shouldn't be empty, but just in case if (! cursor . moveToFirst ()) { Log .
e ( TAG , \"Empty row\" ); return ; } Then we can get the state and also the downloaded file information like this. int statusIndex = cursor . getColumnIndex ( DownloadManager . COLUMN_STATUS ); if ( DownloadManager . STATUS_SUCCESSFUL != cursor . getInt ( statusIndex )) { Log . w ( TAG , \"Download Failed\" ); return ; } int uriIndex = cursor . getColumnIndex ( DownloadManager . COLUMN_LOCAL_URI ); String downloadedPackageUriString = cursor . getString ( uriIndex ); Now that we have the downloaded file's URI, we can either copy it somewhere else or go ahead and process it. There is more information to query when the download fails, e.g., the failure reason, how much has been downloaded, etc. Please refer to the documentation of DownloadManager for the complete list of column names.","tags":"android","title":"How to Use Android DownloadManager"},{"url":"http://jhshi.me/2013/12/02/how-to-create-and-sign-ota-package/index.html","text":"I'm currently maintaining the Conductor App for the PhoneLab testbed . One of the core tasks performed by conductor is to perform system OTA updates, so that we can push platform changes to our participants, either to fix bugs or to do system-level experiments (libc, Dalvik VM, etc.). So the first step is: how do we create an OTA package? Directory Structure Suppose we have a patched version of libc and we want to overwrite the previous one already on participants' phones. We first need to figure out where that file is in the file system. In this case, it's /system/lib/libc.so . Then our OTA package's directory structure must look like this: myupdate | -- META-INF | ` -- com | ` -- google | ` -- android | | -- update-binary | ` -- updater-script ` -- system ` -- lib ` -- libc.so The update-binary and updater-script are used to actually perform the update; I'll explain them later.
Note that the structure of the system directory needs to be exactly the same as what's in Android's setup, so that we can copy that directory directly to the target system and overwrite the files with the updated versions. The updater-script The update-binary , as its name indicates, is a binary file that will parse the updater-script we write. It's quite standard and nothing special. You can obtain a copy of this file here . The updater-script contains the operations we want to perform. It's written in the Edify scripting language, which has a quite simple and intuitive syntax. You can find more details in this xda thread . In this case, what we need to do is quite simple: mount the /system partition and copy the files in the OTA package to the target file system. So the updater-script may look like this: mount(\"ext4\", \"EMMC\", \"/dev/block/platform/omap/omap_hsmmc.0/by-name/system\", \"/system\"); package_extract_dir(\"system\", \"/system\"); unmount(\"/system\"); First, we mount the target file system's system partition using the mount command; the arguments are: FSTYPE : File system type. In this case, it's \"ext4\". TYPE : Storage type. \"EMMC\" means an internal solid-state storage device on the MMC bus, which is actually NAND flash. DEV : The device to mount. PATH : Mount point. You can find all the mounted devices in Android by running adb shell and then mount .
Here is one sample output: shell@android:/ $ mount rootfs / rootfs ro,relatime 0 0 tmpfs /dev tmpfs rw,nosuid,relatime,mode=755 0 0 devpts /dev/pts devpts rw,relatime,mode=600 0 0 proc /proc proc rw,relatime 0 0 sysfs /sys sysfs rw,relatime 0 0 none /acct cgroup rw,relatime,cpuacct 0 0 tmpfs /mnt/secure tmpfs rw,relatime,mode=700 0 0 tmpfs /mnt/asec tmpfs rw,relatime,mode=755,gid=1000 0 0 tmpfs /mnt/obb tmpfs rw,relatime,mode=755,gid=1000 0 0 none /dev/cpuctl cgroup rw,relatime,cpu 0 0 /dev/block/platform/omap/omap_hsmmc.0/by-name/system /system ext4 ro,relatime,barrier=1,data=ordered 0 0 /dev/block/platform/omap/omap_hsmmc.0/by-name/efs /factory ext4 ro,relatime,barrier=1,data=ordered 0 0 /dev/block/platform/omap/omap_hsmmc.0/by-name/cache /cache ext4 rw,nosuid,nodev,noatime,errors=panic,barrier=1,nomblk_io_submit,data=ordered 0 0 /dev/block/platform/omap/omap_hsmmc.0/by-name/userdata /data ext4 rw,nosuid,nodev,noatime,errors=panic,barrier=1,nomblk_io_submit,data=ordered 0 0 /sys/kernel/debug /sys/kernel/debug debugfs rw,relatime 0 0 /dev/fuse /mnt/shell/emulated fuse rw,nosuid,nodev,relatime,user_id=1023,group_id=1023,default_permissions,allow_other 0 0 Then we do the actual copy using the package_extract_dir command. This will copy the updated libc.so file. And finally we unmount the /system partition. Pack It Up Inside the myupdate directory , use this command to create the zip file. zip -r9 ../myupdate.zip * Note that the command is executed inside the myupdate directory, and the zip file is created in the parent directory. This is because the META-INF and system directories must be in the root directory of the final zip file. Sign the OTA Package Up to this point, the OTA package we just created should apply successfully on custom recoveries like CWM, in which signature verification is turned off by default.
However, to automate the OTA process, we're using the Android RecoverySystem to reboot the phone and apply the update, in which case signature verification is turned on. So we need to sign the package with the proper keys, which are the platform keys. Suppose you've got the platform keys, named platform.x509.pem and platform.pk8 ; we can then use the signapk.jar tool. java -jar signapk.jar -w platform.x509.pem platform.pk8 myupdate.zip myupdate-signed.zip Note that: We need the -w flag to sign the whole zip file. The order of the two key files matters: the pem file goes first, then the pk8 file. This will generate the final OTA package, myupdate-signed.zip , which WILL pass the signature verification of the recovery system.","tags":"Android","title":"How to Create and Sign OTA Package"},{"url":"http://jhshi.me/2013/12/02/command-dispatching/index.html","text":"In a few network projects, we're asked to write an interactive shell to receive commands from user input. Here is the general pattern I used. The example used here is from the P2P network project , and you can find my earlier post about using select to monitor user input and a socket at the same time . Command Handling Functions Since each command may have a varying number of arguments or options, it's straightforward to use the standard argc and argv interface. So for each command, we define their handling functions as follows. int cmd_help ( int argc , char * argv []); int cmd_myip ( int argc , char * argv []); int cmd_myport ( int argc , char * argv []); int cmd_register ( int argc , char * argv []); int cmd_connect ( int argc , char * argv []); int cmd_list ( int argc , char * argv []); int cmd_terminate ( int argc , char * argv []); int cmd_exit ( int argc , char * argv []); int cmd_download ( int argc , char * argv []); int cmd_creator ( int argc , char * argv []); int cmd_packet ( int argc , char * argv []); Command Table It'll be tedious to manually decide which handling function to call.
Instead, we'll use a data structure called a Command Table to gracefully handle the cases for all commands. struct { char * name ; int ( * handler )( int argc , char * argv []); char * help_msg ; } cmd_table [] = { { \"HELP\" , cmd_help , \": Show available user interface options.\" }, { \"MYIP\" , cmd_myip , \": Show IP address of this process.\" }, { \"MYPORT\" , cmd_myport , \": Show port on which this process is listening.\" }, { \"REGISTER\" , cmd_register , \" <server_IP> <port_no>: Client register to server.\" }, { \"CONNECT\" , cmd_connect , \" <destination> <port_no>: Connect to a peer client.\" }, { \"LIST\" , cmd_list , \": Show list of connected hosts.\" }, { \"TERMINATE\" , cmd_terminate , \" <connection_id>: Terminate a certain connection\" }, { \"EXIT\" , cmd_exit , \": Close all connections and terminate this process.\" }, { \"DOWNLOAD\" , cmd_download , \" <file_name> <file_chunk_size_in_bytes>: Download a file in parallel.\" }, { \"CREATOR\" , cmd_creator , \": Show author's info.\" }, { \"PACKET\" , cmd_packet , \" <packet_size_in_bytes>: Set packet size.\" }, { NULL , NULL , NULL }, }; Here we define, for each command, which handler to use and also the help message. More specifically, our cmd_help can be written as simply as follows. int cmd_help ( int argc , char * argv []) { printf ( \"Available commands are: \\n \" ); for ( int i = 0 ; cmd_table [ i ]. name != NULL ; i ++ ) { printf ( \"%s%s \\n \" , cmd_table [ i ]. name , cmd_table [ i ]. help_msg ); } return 0 ; } Command Dispatching Now suppose you have already found that STDIN_FILENO is available to read using select , which means the user has entered some input and hit the Enter key. Then we need to read the input and dispatch the command.
int handle_command ( void ) { char * command = NULL ; size_t len = 0 ; /* let getline allocate memory for us */ if ( getline ( & command , & len , stdin ) < 0 ) { perror ( \"getline\" ); return - 1 ; } if ( cmd_dispatch ( command ) < 0 ) { return - 1 ; } free ( command ); return 0 ; } Here we use the getline function to read the input from stdin . getline will allocate the buffer for us, so we need not worry about the input size. But we do need to free the buffer afterwards. int cmd_dispatch ( char * cmd ) { char * argv [ 512 ]; int argc = 0 ; for ( char * word = strtok ( cmd , \" \\t\\n \" ); word != NULL ; word = strtok ( NULL , \" \\t\\n \" )) { if ( argc >= 512 ) { printf ( \"[ERROR]: too many arguments \\n \" ); return - 1 ; } argv [ argc ++ ] = word ; } if ( argc == 0 ) { return 0 ; } for ( int i = 0 ; cmd_table [ i ]. name != NULL ; i ++ ) { if ( ! strcmp ( argv [ 0 ], cmd_table [ i ]. name )) { return cmd_table [ i ]. handler ( argc , argv ); } } printf ( \"[ERROR]: command not found. \\n \" ); return - 1 ; } In cmd_dispatch , we first split the input into an array of strings, then we traverse the command table to find a match.","tags":"network","title":"Command Dispatching"},{"url":"http://jhshi.me/2013/11/17/post-revision-plugin-for-octopress/index.html","text":"Writing blogs is not a one-time thing. Maybe sometime after you posted a blog, you find a typo, or you get some feedback from your readers and want to further elaborate on some paragraph in your blog, and so on. So keeping a revision history for each post is a good idea, not only for you, but also for your readers, to let them know that you keep polishing your blogs. However, doing this manually is kind of tedious, especially when you've made multiple changes you want to show. Fortunately, you use a static site generator (like Jekyll or Octopress ) and use git to manage your content. (What? You don't? Then I feel sad for you :-) So why not just show the git revision history for that blog?
This is what octopress-post-revision is for. If you're interested, please refer to the README page for how to install and configure this plugin. This post gives a detailed description of how this plugin works. The idea is simple, yet implementing it is not trivial. It was more difficult for me since this is my first time trying to write some code in ruby... But let's break the task down into pieces and tackle them one by one. Get Post's Path On Your Local File System We need this information since we need to run a git log on those files. Jekyll provides the Generator interface, which allows us to generate extra information, which is exactly what we want. We need three pieces of information: Post file's full/absolute path Post file name Post file's dir name The last two are used to generate the View on Github link. This is what the PostFullPath looks like. class PostFullPath < Generator safe :true priority :high # Generate file info for each post and page # +site+ is the site def generate ( site ) site . posts . each do | post | base = post . instance_variable_get ( :@base ) name = post . instance_variable_get ( :@name ) post . data . merge! ({ 'dir_name' => '_posts' , 'file_name' => name , 'full_path' => File . join ( base , name ), }) end site . pages . each do | page | base = page . instance_variable_get ( :@base ) dir = page . instance_variable_get ( :@dir ) name = page . instance_variable_get ( :@name ) page . data . merge! ({ 'dir_name' => dir , 'file_name' => name , 'full_path' => File . join ( base , dir , name )}) end end end The Post class has several instance variables (e.g., @base, @name ) that hold the file information, yet how to get at them from outside the class puzzled me. After Googling a bit, this thread gave me the solution, i.e., the instance_variable_get method. Another thing to note is the dir_name : since Jekyll assumes post files are put in the _posts directory, we can hard code post['dir_name'] as _posts .
Yet for pages, we need the real dir name. The revision Liquid Tag Once we have the file information, we can use git to get the change history of that file. We also need to format the logs for display purposes. Here is the code that fetches logs from git : cmd = 'git log --date=local --pretty=\"%cd|%s\" --max-count=' + @limit . to_s + ' ' + full_path logs = ` #{ cmd } ` We specify the date format as local , and a customized format for the log message. %cd means commit date, and %s is the subject. We also limit the number of logs, in case you have too many commits on one post. The View on Github Link Since we only display the latest @limit commits, we provide the View on Github link, which links to GitHub's commit history page. The format of the URL is https://github.com/<user>/<repo>/commits/<branch>/<file_path> Here is the code that gets the branch information. cmd = 'git rev-parse --abbrev-ref HEAD' # chop last '\\n' of branch name branch = ` #{ cmd } ` . chop Now we have all the information we need, and here is how we compose the final URL. link = File . join ( 'https://github.com' , site [ 'github_user' ] , site [ 'github_repo' ] , 'commits' , branch , site [ 'source' ] , post [ 'dir_name' ] , post [ 'file_name' ] )","tags":"octopress","title":"Post Revision Plugin For Octopress"},{"url":"http://jhshi.me/2013/11/10/popular-posts-plugin-for-octopress/index.html","text":"This post describes the octopress-popular-posts plugin for Octopress. Although there is one plugin that does the job, it uses Google Page Rank to determine whether a post is popular or not. I'd like to, however, use the page view of the post as the metric. How To Use In another post , I described how to use the octopress-page-view plugin to show the PV of each post and the whole site. This plugin depends on that one to generate each post's PV count. So you need to install that plugin first.
Installation Clone the repo from Github cd /tmp git clone https://github.com/jhshi/octopress-popular-posts.git cd octopress-popular-posts The structure of the directory will look like this octopress-popular-posts/ |-- _config.yml |-- plugins | `-- popular_posts.rb |-- README.md `-- source `-- _include `-- custom `-- asides `-- popular_posts.html Copy plugins/popular_posts.rb to your plugins directory. And place source/include/custom/asides/popular_posts.html in your custom asides directory. Add this aside to your asides list in _config.yml Configuration This plugin doesn't need any special configuration; as long as you set up the octopress-page-view plugin correctly, it should work out of the box. There is one parameter you can tune, though: you can set how many popular posts will be shown in popular_posts.html How It Works octopress-page-view has done all the hard work for us. All we need to do is sort the posts by their page view count. Note that we need to set the priority of this plugin to low , since we rely on the octopress-page-view plugin running first to generate the PV count. class PopularPosts < Generator safe :true priority :low def generate ( site ) # require octopress-page-view plugin if ! site . config [ 'page-view' ] return end popular_posts = site . posts . sort do | px , py | # just catch the rare case if px . data [ '_pv' ] == nil || py . data [ '_pv' ] == nil then 0 elsif px . data [ '_pv' ] > py . data [ '_pv' ] then - 1 elsif px . data [ '_pv' ] < py . data [ '_pv' ] then 1 else 0 end end site . config . merge!
( 'popular_posts' => popular_posts ) end end One trick here is that the site object has no data field to merge into, so I merge the popular_posts data into site.config .","tags":"octopress","title":"Popular Posts Plugin for Octopress"},{"url":"http://jhshi.me/2013/11/10/page-view-plugin-for-octopress/index.html","text":"It's always nice to display some blog stats, such as page view count, to give readers a sense of how popular some site/posts are. Unfortunately, there is (or should I say 'was'?) no such plugin that does this job nicely for Octopress , so I decided to write one myself. And here comes the plugin called octopress-page-view . I use Google Analytics to track my blog. And there is an Octopress plugin called jekyll-ga , which can sort blog posts by certain metrics of Google Analytics. For me, chronological order works just fine. So I just grabbed the part that fetches data from Google Analytics. I haven't done any decent ruby coding before, so bear with me if I wrote some silly ruby code. But it works. How To Use Get the plugin Install required gems sudo gem install chronic google-api-client Clone the repository cd /tmp git clone https://github.com/jhshi/octopress-page-view.git cd octopress-page-view The structure of the directory will look like this octopress-page-view/ | -- _config.yml | -- plugins | ` -- page_view.rb | -- README.md ` -- source ` -- _include ` -- custom ` -- asides ` -- pageview.html Copy plugins/page_view.rb to your plugins directory, and copy source/_include/custom/asides/pageview.html to your custom asides directory. In your _config.yml , add pageview.html to your asides array. Setup and Configuration The README file of the jekyll-ga project gives a very detailed description of how to set up a service account for the Google Data API , which I'm not going to repeat here. After you've set up the service account, you'll need to add some configurations to your _config.yml file. Here is a sample configuration.
# octopress-page-view page-view : service_account_email : # XXXXXX@developer.gserviceaccount.com key_file : privatekey.p12 # service account private key file key_secret : notasecret # service account private key's password profileID : # ga:XXXXXXXX start : 3 years ago # Beginning of report end : now # End of report metric : ga:pageviews # Metric code segment : gaid::-1 # All visits filters : # optional How It Works This plugin provides a Jekyll Generator , called GoogleAnalytics , to fetch data from Google, and a Jekyll Liquid Tag to actually generate the formatted page view count. Fetch Analytics Data This part is adapted from jekyll-ga . Basically, we will create a Google API client and, after proper authorization, make requests to Google. pv = site . config [ 'page-view' ] # need to provide application_name and application_version, otherwise, APIClient # will warn ... client = Google :: APIClient . new ( :application_name => 'octopress-page-view' , :application_version => '1.0' , ) # Load our credentials for the service account key = Google :: APIClient :: KeyUtils . load_from_pkcs12 ( pv [ 'key_file' ] , pv [ 'key_secret' ] ) client . authorization = Signet :: OAuth2 :: Client . new ( :token_credential_uri => 'https://accounts.google.com/o/oauth2/token' , :audience => 'https://accounts.google.com/o/oauth2/token' , :scope => 'https://www.googleapis.com/auth/analytics.readonly' , :issuer => pv [ 'service_account_email' ] , :signing_key => key ) # Request a token for our service account client . authorization . fetch_access_token! analytics = client . discovered_api ( 'analytics' , 'v3' ) # prepare parameters params = { 'ids' => pv [ 'profileID' ] , 'start-date' => Chronic . parse ( pv [ 'start' ] ) . strftime ( \"%Y-%m-%d\" ), 'end-date' => Chronic . parse ( pv [ 'end' ] ) .
strftime ( \"%Y-%m-%d\" ), 'dimensions' => \"ga:pagePath\" , 'metrics' => pv [ 'metric' ] , 'max-results' => 100000 , } if pv [ 'segment' ] params [ 'segment' ] = pv [ 'segment' ] end if pv [ 'filters' ] params [ 'filters' ] = pv [ 'filters' ] end response = client . execute ( :api_method => analytics . data . ga . get , :parameters => params ) results = Hash [ response . data . rows ] So now we have a hash about query results. Calculate Page View For each blog post, we want to display just the page view of that blog. However, in blog index pages, we want to display the total page view of this site. So we process post and page slightly differently. Also, we'll set our generator's priority to high , in case other plugins also want to use the _pv information. # total page view of this site tot = 0 # display per post page view site . posts . each { | post | url = ( site . config [ 'baseurl' ] || '' ) + post . url + 'index.html' hits = ( results [ url ] )? results [ url ]. to_i : 0 post . data . merge! ( \"_pv\" => hits ) tot += hits } # calculate total page view site . pages . each { | page | url = ( site . config [ 'baseurl' ] || '' ) + page . url hits = ( results [ url ] )? results [ url ]. to_i : 0 tot += hits } # display total page view in page site . pages . each { | page | page . data . merge! ( \"_pv\" => tot ) } So now each post or page contains one ore field, called _pv , which is the page view count of that post , or total PV for page . Display Page View This is done using a Liquid Tag called PageViewTag . In the render method, we just output an nicely formatted page view count. site = context . environments . first [ 'site' ] if ! site [ 'page-view' ] return '' end post = context . environments . first [ 'post' ] if post == nil post = context . environments . first [ 'page' ] if post == nil return '' end end pv = post [ '_pv' ] if pv == nil return '' end html = pv . to_s . reverse . gsub ( /...(?=.)/ , '\\&,' ) . 
reverse + ' hits' return html","tags":"octopress","title":"Page View Plugin for Octopress"},{"url":"http://jhshi.me/2013/11/03/pop-up-alertdialog-in-system-service/index.html","text":"I've been working on OTA support for the PhoneLab testbed . And one problem I encountered is that when I tried to pop up an AlertDialog to let the user confirm the update, I got an error that said something like this: android.view.WindowManager $ BadTokenException : Unable to add window -- token null is not for an application Apparently, the context I used to create the dialog, which is the service context, is not valid in the sense that it has no window attached. Yet creating an Activity just to pop up an alert dialog is a bit overdone, since my app is basically a background service. Here is how I solved this problem. Add the android.permission.SYSTEM_ALERT_WINDOW permission to AndroidManifest.xml <uses-permission android:name= \"android.permission.SYSTEM_ALERT_WINDOW\" /> After creating the dialog, before showing it, set its window type to system alert. // builder set up code here // ... AlertDialog dialog = builder . create (); dialog . getWindow (). setType ( WindowManager . LayoutParams . TYPE_SYSTEM_ALERT ); dialog . show (); Ref: stackoverflow thread , another similar post","tags":"Android","title":"Pop Up AlertDialog in System Service"},{"url":"http://jhshi.me/2013/11/02/fight-against-the-address-alrady-in-use-error/index.html","text":"You have probably seen this error quite often. The reason why this error occurs is explained in detail here . In short, if a TCP socket is not closed properly when the program exits, the OS will put that socket in a TIME_WAIT state for a period of time ( 2MSL , usually a couple of minutes). During that time, if you want to bind to the same port, you'll get the \"Address already in use\" error, even though technically nobody is actually using that port.
In practice, especially when you're debugging, it's very annoying to wait (even a few minutes) before you can re-run your program if it crashed previously. And you are often very sure you're the only one who will use that port number. The solution is to use the SO_REUSEADDR option to avoid that binding error. /* reuse server port, since the OS will prevent us from binding to this port * immediately after we close the sock */ int optval = 1 ; if ( setsockopt ( server_sock , SOL_SOCKET , SO_REUSEADDR , & optval , sizeof ( optval )) != 0 ) { perror ( \"setsockopt\" ); return - 1 ; } Now you can happily bind to that port again, again, and again... Resources Here are a few stackoverflow threads discussing what happens to the old open socket , and the difference between SO_REUSEADDR and SO_REUSEPORT .","tags":"network","title":"Fight Against the 'Address already in use' Error"},{"url":"http://jhshi.me/2013/11/02/use-select-to-monitor-multiple-file-descriptors/index.html","text":"In the P2P network project , we were asked to simultaneously monitor user input and potential incoming messages, yet we were not supposed to use multiple threads or processes. That leaves us no choice but the select function. In short, select allows you to monitor multiple file descriptors at the same time, and tells you when some of them are available to read or write. fd_set Operations fd_set is a fixed-size buffer that can host a few (up to FD_SETSIZE ) file descriptors. sys/select.h provides a few macros to manipulate the fd_set . void FD_CLR ( int fd , fd_set * set ); int FD_ISSET ( int fd , fd_set * set ); void FD_SET ( int fd , fd_set * set ); void FD_ZERO ( fd_set * set ); Basically: FD_CLR will remove a fd from the fd_set ; FD_ISSET will test whether a certain fd is in the fd_set ; FD_SET will add a fd to the fd_set ; FD_ZERO will clear the fd_set . Improved fd_set Wrappers In practice, you'll often need to maintain a fd_set together with the maximum fd in that set (more on this later).
So I use a few wrappers to update the fd_set and the max_fd at the same time. #include <sys/select.h> #include <assert.h> /* add a fd to fd_set, and update max_fd */ int safe_fd_set ( int fd , fd_set * fds , int * max_fd ) { assert ( max_fd != NULL ); FD_SET ( fd , fds ); if ( fd > * max_fd ) { * max_fd = fd ; } return 0 ; } /* clear fd from fds, update max fd if needed */ int safe_fd_clr ( int fd , fd_set * fds , int * max_fd ) { assert ( max_fd != NULL ); FD_CLR ( fd , fds ); if ( fd == * max_fd ) { ( * max_fd ) -- ; } return 0 ; } The select Function The prototype of the function looks like this: int select ( int nfds , fd_set * readfds , fd_set * writefds , fd_set * exceptfds , struct timeval * timeout ); In our case, we only want to monitor a set of fds that are available to read, so we don't really care about writefds or exceptfds ; just leave them as NULL . A key point here is that the console is also a file, with fd STDIN_FILENO , just like other files (sockets, normal files, etc.). So to monitor user input as well as the socket, we only need to add their fds to the readfds . Another trick is that nfds is the highest-numbered file descriptor in readfds , plus 1 . So you'll want to set nfds as max_fd+1 . Also, note that select will modify the readfds you pass in, so you'll definitely want to back up your readfds before calling select . In this project, if nothing happens (no user input and no incoming message), we just wait, so the timeout parameter is not used here. Connect the Dots We usually call select inside a while loop to keep monitoring possible inputs. Here is a code snippet that demonstrates the typical usage of select .
fd_set master; /* add stdin and the sock fd to master fd_set */ FD_ZERO(&master); safe_fd_set(STDIN_FILENO, &master, &max_fd); safe_fd_set(server_sock, &master, &max_fd); char prompt[512]; sprintf(prompt, \"[%s@%s] $ \", is_server?\"server\":\"client\", hostname); while (1) { printf(\"\\r%s\", prompt); fflush(stdout); /* back up master */ fd_set dup = master; /* note the max_fd+1 */ if (select(max_fd+1, &dup, NULL, NULL, NULL) < 0) { perror(\"select\"); return -1; } /* check which fd is available for read */ for (int fd = 0; fd <= max_fd; fd++) { if (FD_ISSET(fd, &dup)) { if (fd == STDIN_FILENO) { handle_command(); } else if (fd == server_sock) { printf(\"\\n\"); handle_new_connection(); } else { handle_message(fd); } } } }","tags":"network","title":"Use Select to Monitor Multiple File Descriptors"},{"url":"http://jhshi.me/2013/11/02/how-to-get-hosts-ip-address/index.html","text":"I encountered this problem while doing a network course project . Easy as it sounds, it's actually not a trivial task. Old-fashioned gethostbyname I did some network programming in the old days, so I was tempted to use the straightforward way using gethostbyname . #include <unistd.h> #include <netdb.h> char hostname [ 256 ]; if ( gethostname ( hostname , sizeof ( hostname )) < 0 ) { perror ( \"gethostname\" ); return - 1 ; } struct hostent * host = gethostbyname ( hostname ); if ( host == NULL ) { perror ( \"gethostbyname\" ); return - 1 ; } struct in_addr ip = * ( struct in_addr * ) host -> h_addr_list [ 0 ]; printf ( \"My IP is %s \\n \" , inet_ntoa ( ip )); Yet when I run the program, this code snippet will always print out 127.0.0.1 , which is not useful since I want to get the real (or external ) IP address.
Apparently, this is because of some nasty settings in the /etc/hosts file; there is an entry that looks like this 127.0.0.1 timberlake.cse.buffalo.edu timberlake localhost.localdomain localhost Since gethostbyname is actually a DNS lookup process, that DNS request, unfortunately, is served by the /etc/hosts file instead of a real DNS server. More Advanced getifaddrs I searched the web and found this Stack Overflow thread talking about using getifaddrs to get the NIC's IP address. I tried and it seems to work. Since the machine I worked on uses \"eth0\" as the external NIC, when looping over the results, I just match the one that has the name \"eth0\". Although it works well, the solution is a little bit ad hoc, since the network interface's name is not necessarily \"eth0\"; for example, on some laptops or netbooks, the primary interface may be \"wlan0\" instead of \"eth0\". Most Elegant Way Finally, I adopted the solution mentioned later in that thread. Basically, I connect to a well-known server (e.g., Google's DNS server) and then get my local socket's information (more specifically, the IP) using getsockname . Here is the final code snippet.
/* get my hostname */ char hostname [ 256 ] ; if ( gethostname ( hostname , sizeof ( hostname )) < 0 ) { perror ( \"gethostname\" ); return -1 ; } // Google ' s DNS server IP char * target_name = \"8.8.8.8\" ; // DNS port char * target_port = \"53\" ; /* get peer server */ struct addrinfo hints ; memset (& hints , 0 , sizeof ( hints )); hints .ai_family = AF_INET ; hints .ai_socktype = SOCK_STREAM ; struct addrinfo * info ; int ret = 0 ; if (( ret = getaddrinfo ( target_name , target_port , & hints , & info )) != 0 ) { printf ( \" [ ERROR ] : getaddrinfo error: %s\\n\" , gai_strerror ( ret )); return -1 ; } if ( info- > ai_family == AF_INET6 ) { printf ( \" [ ERROR ] : do not support IPv6 yet.\\n\" ); return -1 ; } /* create socket */ int sock = socket ( info- > ai_family , info- > ai_socktype , info- > ai_protocol ); if ( sock <= 0 ) { perror ( \"socket\" ); return -1 ; } /* connect to server */ if ( connect ( sock , info- > ai_addr , info- > ai_addrlen ) < 0 ) { perror ( \"connect\" ); close ( sock ); return -1 ; } /* get local socket info */ struct sockaddr_in local_addr ; socklen_t addr_len = sizeof ( local_addr ); if ( getsockname ( sock , ( struct sockaddr *)& local_addr , & addr_len ) < 0 ) { perror ( \"getsockname\" ); close ( sock ); return -1 ; } /* get local ip addr */ char myip [ INET_ADDRSTRLEN ] ; if ( inet_ntop ( local_addr .sin_family , &( local_addr .sin_addr ), myip , sizeof ( myip )) == NULL ) { perror ( \"inet_ntop\" ); return -1 ; }","tags":"network","title":"How to Get Local Host's Real IP Address"},{"url":"http://jhshi.me/2013/10/02/persist-and-synchroize-vim-undo-history-using-dropbox/index.html","text":"It's extremely useful to - Have a virtually unlimited undo history, and - Have it persisted even after exiting VIM, and - Better, even have it synchronized across all your working machines using Dropbox. Here is the .vimrc snippet I used to do the trick.
\" Persist undo set undofile \"maximum number of changes that can be undone set undolevels=9999 \"maximum number lines to save for undo on a buffer reload set undoreload=9999 \" If have Dropbox installed, create a undo dir in it if isdirectory(expand(\" $ HOME /Dropbox/\")) silent !mkdir -p $ HOME /Dropbox/.vimundo >/dev/null 2>&1 set undodir= $ HOME /Dropbox/.vimundo// else \" Otherwise, keep them in home silent !mkdir -p $ HOME /.vimundo >/dev/null 2>&1 set undodir= $ HOME /.vimundo// end Note the double slash after the undodir ; it tells VIM to name the undo file using the full path of the file being edited, so no naming collision will occur.","tags":"vim","title":"Persist and Synchronize VIM undo History using Dropbox"},{"url":"http://jhshi.me/2013/09/03/importerror-cannot-import-name-compare-xml/index.html","text":"When I tried to fire up the Django server using manage.py , I kept getting this error, which is caused by from django.test.utils import compare_xml . It turns out that I was using the wrong Django version (1.4), and I should upgrade to 1.5. The easiest way to upgrade is using easy_install # install easy_install if you haven't done so sudo apt-get install python-setuptools # now upgrade sudo easy_install --upgrade django And the python file that actually contains the compare_xml method is located in (in my case): /usr/local/lib/python2.7/dist-packages/Django-1.5.2-py2.7.egg/django/test/utils.py But in the process of figuring out this issue, I learned several things. When importing django modules, you have to define the DJANGO_SETTINGS_MODULE environment variable. Just setting it to your project's settings module will be OK.
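The environment variable can be set from the shell before launching Python; a minimal sketch, assuming a hypothetical project named mysite whose settings live in mysite/settings.py (note the exact variable name Django reads is DJANGO_SETTINGS_MODULE):

```shell
# Point Django at the settings module of a hypothetical project "mysite".
export DJANGO_SETTINGS_MODULE=mysite.settings
# Any subsequent "import django..." in this shell session will pick it up.
echo "$DJANGO_SETTINGS_MODULE"
```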
To find out what methods are provided in a module, say django.test.utils , and various other information, you can use this command in shell: $ DJANGO_SETTINGS_MODULE=settings python -c \"import django.test.utils;help(django.test.utils);\"","tags":"errors","title":"ImportError: cannot import name compare_xml"},{"url":"http://jhshi.me/2013/04/13/userwarning-module-dap-was-already-imported-from-none/index.html","text":"I installed python-matplotlib and python-mpltoolkits.basemap using apt , but when I tried to import Basemap using from mpl_toolkits.basemap import Basemap , the following warning shows up: usr/lib/pymodules/python2.7/mpl_toolkits/__init__.py:2: UserWarning: Module dap was already imported from None, but /usr/lib/python2.7/dist-packages is being added to sys.path __import__('pkg_resources').declare_namespace(__name__) To resolve this warning, edit the file /usr/lib/python2.7/dist-packages/dap-2.2.6.7.egg-info/namespace_packages.txt and add dap as the first line. dap dap.plugins dap.responses Ref: Stackoverflow Question","tags":"errors","title":"UserWarning: module dap was already imported from None"},{"url":"http://jhshi.me/2013/04/07/speed-up-octopress-generation/index.html","text":"rake generate can take quite a while, especially when you have many blog posts. Here are a few tips on how to speed up the generation process. Use rake isolate and rake integrate It's usually the case that you have many existing posts while you're modifying only a few of them; it's certainly overkill to compile all the posts if you just want to preview the ones you're really editing. Octopress provides an isolate command just for this purpose. The idea is, you can use rake isolate to move all uninteresting posts into a separate directory outside source/_posts , so when you do rake generate , you'll just compile those posts you care about. When you're done editing and want to deploy your site, you can use rake integrate to move those posts back and generate a complete site.
The usage of rake isolate is simple: you just provide the keywords, and those posts whose titles contain these keywords are kept; other posts are moved to source/_stash . Say I'm composing a post named 2013-04-07-hello-world.markdown , and assume this post is the only one that contains hello in its title. Then the following command will do the job: $ rake isolate [ hello ] Use rb-gsl to boost lsi computation Jekyll has builtin support for related posts, as does Octopress. You just need to add this line to your _config.yml : lsi : true Once you enable lsi , you'll definitely want to install the rb-gsl package to make the related post generation process faster. When Octopress reminds you that: Notice : for 10 x faster LSI support , please install http :// rb - gsl . rubyforge . org / It's not kidding! Note that Octopress doesn't work with the latest gsl versioned 1.15.* . You'll need to install gsl 1.14 manually since apt or yum will probably install 1.15.* for you. wget http://ftp.gnu.org/gnu/gsl/gsl-1.14.tar.gz tar xvf gsl-1.14.tar.gz cd gsl-1.14 ./configure make sudo make install Check the installation with the gsl-config command: gsl-config --version 1.14 Then edit your Gemfile in your blog source root. Add the following line in the development group: gem 'gsl' Then use bundle to install it. bundle install You're all set. Now when you do rake generate , you shouldn't see that 10x faster line anymore.","tags":"octopress","title":"Speed up Octopress Generation"},{"url":"http://jhshi.me/2013/04/07/why-i-switched-to-octopress/index.html","text":"I used to blog on wordpress.com. After a year or so, I finally decided to abandon it and switched to Octopress + Github Pages . Here are the reasons, and how I migrated to Octopress. Maybe it's because I was using wordpress.com , and those who use a self-hosted wordpress may have something different to say, but the way I see it, wordpress, at least wordpress.com , sucks. Use your favorite editor?
No-no. I am a Vim addict and I use Vim for almost everything (except for watching videos perhaps). It's extremely uncomfortable using the dumb text input frame embedded in a web page. Besides, I often need to insert inline code or code blocks in blogs. For inline code, I have to use plain text mode and wrap it using the html <code> tag manually. And for code blocks, I have to use the stupid, unportable [sourcecode] tag. When I realized that this awful experience was even cooling down my passion for blogging, I knew it was time to change. With Octopress, I can use Vim to compose blogs locally. For formatting, Markdown does a decent enough job. I'm more than happy with these. Page loading speed In Wordpress, everything is stored in a database, and the page is generated dynamically when you request it. Despite those caching plugins , why bother with dynamic pages anyway when static pages are just good enough? Using Google's PageSpeed Insights for measurement, my old blog site hosted on wordpress.com got a 78 out of 100 score, while this blog got 91 out of 100. Hooray! Migration Jekyll offers several ways to migrate your previous blogs . Octopress is based on Jekyll, so all these ways also apply. I found the Exitwp tool extremely useful for migrating wordpress blogs. One drawback of Exitwp is that it can not handle non-ASCII characters, so a few of my previous blogs written in Chinese could not be migrated using it.","tags":"octopress","title":"Why I Switched to Octopress"},{"url":"http://jhshi.me/2013/04/05/os161-synchronization-primitives-rwlock/index.html","text":"The idea of the Reader-Writer Lock is quite simple. With a normal lock , we don't differentiate the threads. That is, each thread that wants to enter the critical section must first acquire the lock. But on second thought, you may find that threads actually have different behavior inside the critical section: some threads just want to see the values of shared variables, while others really want to update those variables.
An Example Suppose we have a book database in a library; each reader who wants to query the database must first acquire the lock before he can actually do the query. The library manager, who wants to update some book info, also needs to acquire the lock before he can do the actual update. In this case, we can see that the queries of multiple readers in fact have no conflict. So ideally they should be allowed to be in the critical section at the same time. On the other hand, the library manager must have exclusive access to the database while he's updating. No readers and no other managers can enter the critical section until the first manager leaves. So, two rules for rwlock: Multiple readers can be in the critical section at the same time One and only one writer can be in the critical section at any time Starvation Suppose the coming sequence of threads is \"RWRRRRR...\", in which R denotes a reader and W denotes a writer. The first reader arrives, finds no one in the critical section, and happily comes in. Before he leaves, the writer arrives, but finds there is a reader inside the critical section, so the writer waits. While the writer is waiting, the second reader comes and finds there is one reader inside the critical section; literally, it's OK for him to come in according to the rules, right? The same case applies to the third, fourth, ..., readers. So without special attention, we see readers come and go, while the poor writer keeps waiting, for a virtually \"unbounded\" time. In this case, the writer is starved. The thing is, the second, third, fourth, ..., readers shouldn't enter the critical section since there is a writer waiting before them! Implementation There are many ways to implement rwlock. You can use any of semaphore, cv, or lock. Here I introduce one using a semaphore and a lock. It's very simple, yet it has the limitation of supporting only at most a certain number of readers in the critical section. Let's imagine the critical section as a set of resources.
The initial capacity is MAX_READERS . The idea is that each reader needs one of these resources to enter the critical section, while each writer needs all of these resources (to prevent other readers or writers) to enter. To let the readers be aware of the waiting writers, each thread should first acquire a lock before it can acquire the resource. So for rwlock_acquire_read : Acquire the lock Acquire a resource using P Release the lock For rwlock_release_read , just release the resource using V . In rwlock_acquire_write : Acquire the lock, so that no other readers/writers would be able to acquire the rwlock Acquire ALL the resources by doing P MAX_READERS times Release the lock. It's safe now since we hold all the resources. For rwlock_release_write , just release all the resources.","tags":"os161","title":"OS161 Synchronization Primitives: RWLock"},{"url":"http://jhshi.me/2013/04/05/os161-synchronization-primitives-cv/index.html","text":"A condition variable is used for a thread to wait for some condition to be true before continuing. The implementation is quite simple compared to lock , yet the difficult part is to understand how a CV is supposed to be used. CV Interface A condition variable has two interfaces: cv_wait and cv_signal . cv_wait is used to wait for a condition to be true, and cv_signal is used to notify other threads that a certain condition is true. So what? Let's consider a producer-consumer case, where a bunch of threads share a resource pool: some of them (producers) are responsible for putting stuff into the pool and others (consumers) are responsible for taking stuff out of the pool. Obviously, we have two rules. If the pool is full, then producers can not put to the pool If the pool is empty, then consumers can not take stuff from the pool And we use one condition variable for each of these rules: pool_full and pool_empty .
Here is the pseudo code for producer and consumer: void producer ( void ) { lock_acquire ( pool_lock ); while ( pool_is_full ) { cv_wait ( pool_full , pool_lock ); } produce (); /* notify that the pool now is not empty, so if any one is waiting * on the pool_empty cv, wake them up */ cv_signal ( pool_empty , pool_lock ); lock_release ( pool_lock ); } void consumer ( void ) { lock_acquire ( pool_lock ); while ( pool_is_empty ) { cv_wait ( pool_empty , pool_lock ); } consume (); /* notify that the pool now is not full, so if any one is waiting * on the pool_full cv, wake them up */ cv_signal ( pool_full , pool_lock ); lock_release ( pool_lock ); } Here we also use a lock to protect access to the pool. We can see from this example: A condition variable is virtually a wait channel A condition variable is normally used together with a lock, but the condition variable itself doesn't contain a lock What's in the cv structure? Obviously, we need a wait channel. And that's it (probably plus a cv_name ). cv_wait and cv_signal Now let's get to business. The comment in $OS161_SRC/kern/include/synch.h basically tells you everything you need to do. In cv_wait , we need to: Lock the wait channel Release the lock passed in Sleep on the wait channel When woken up, re-acquire the lock. So before cv_wait , we should already hold the lock (so that we can release it). And after cv_wait , we still hold the lock. In cv_signal , we just wake up somebody in the wait channel using wchan_wakeone .","tags":"os161","title":"OS161 Synchronization Primitives: CV"},{"url":"http://jhshi.me/2013/04/04/os161-synchronization-primitives-lock/index.html","text":"A lock is basically just a semaphore whose initial counter is 1. lock_acquire is like P , while lock_release is like V . You probably want to go over my previous post about semaphore Lock's holder However, since only one thread can hold the lock at any given time, that thread is considered to be the holder of this lock.
While in semaphore, we don't have such a holder concept, since multiple threads can \"hold\" the semaphore at the same time. Thus we need to store the holder information in our lock structure, along with the conventional spin lock and wait channel. Intuitively, you may be tempted to use the thread name ( curthread->t_name ) as the thread's identifier. Nevertheless, just as in the real world, a thread's name isn't necessarily unique. OS161 doesn't forbid us from creating a bunch of threads with the same name. There is a global variable defined in $OS161_SRC/kern/include/current.h named curthread , which is a pointer to the kernel data structure of the current thread. Two different threads definitely have different thread structures (hence different pointers), which makes the pointer to the thread structure a good enough thread identifier. Reentrant Lock Another tricky thing is to decide whether we support reentrant locks or not. Basically, a thread can acquire a reentrant lock multiple times without blocking itself. At first glance, you may wonder what kind of dumb thread would acquire a lock multiple times anyway? Well, that kind of thread does exist, and it may not be dumb at all. A reentrant lock is useful when it's difficult for a thread to track whether it has grabbed the lock. Suppose we have multiple threads that traverse a graph simultaneously, and each thread needs to first grab the lock of a node before it can visit that node. If the graph has a cycle or there are multiple paths that lead to the same node, then it's possible that a thread visits the same node twice. Although there is a function named lock_do_i_hold that can tell whether a thread holds a lock or not, unfortunately it's not a public interface of lock. In OS161, it's OK to choose not to support reentrant locks, so when you detect a thread trying to acquire a lock while it's already the lock's holder, just panic.
But if you want to support reentrant locks, you need to make sure a thread won't accidentally lose a lock. For example, void A ( void ) { lock_acquire ( lock1 ); B (); lock_release ( lock1 ); } void B ( void ) { lock_acquire ( lock1 ); printf ( \"Hello world!\" ); lock_release ( lock1 ); } In this case, the thread is supposed to still hold the lock after B returns. The simplest way would be to keep a counter (initial value 0) for each lock. When a thread acquires the lock, increase that counter. When it releases the lock, decrease the counter, and only actually release the lock when the counter reaches 0.","tags":"os161","title":"OS161 Synchronization Primitives: Lock"},{"url":"http://jhshi.me/2013/04/04/os161-synchronization-primitives-semaphore/index.html","text":"A semaphore denotes a certain number of shared resources. Basically, it's one counter and two operations on this counter, namely P and V . P is used to acquire one resource (thus decrementing the counter) while V is used to release one resource (thus incrementing the counter). A Metaphor My favorite example is the printer. Say we have three printers in a big lab, where everybody in the lab shares those printers. Obviously only one print job can be handled by one printer at any time; otherwise, the printed content would be messed up. However, we can not use a single lock to protect access to all three printers. That would be very dumb. An intuitive way is to use three locks, one for each printer. Yet more elegantly, we use a semaphore with an initial counter of 3. Before a user submits a print job, he needs to first P this semaphore to acquire one printer. And after he is done, he needs to V this semaphore to release the printer. If there is already one print job at each printer, then the following poor guys who want to P this semaphore would have to wait. What should a semaphore structure contain? Apparently, we need a counter to record how many resources are available.
Since this counter is a shared variable, we need a lock to protect it. At this point, we only have the spinlock provided in $OS161_SRC/kern/include/spinlock.h . That's fine since our critical section is short anyway. In order to let the poor guys have a place to wait, we also need a wait channel (in OS161_SRC/kern/include/wchan.h ) P Operation The flow of P would be: Acquire the spin lock Check if there are some resources available ( counter > 0 ) If yes, we're lucky. Happily go to step 8. If no, then we first grab the lock of the wait channel, since the wait channel is also shared. Release the spin lock, and wait on the wait channel by calling wchan_sleep We're sleeping... After waking up, first grab the spin lock, and go to step 2 At this point, the counter should be positive; decrement it by 1 Release the spin lock, and return V Operation V is much simpler compared to P . The flow is: Acquire the spin lock Increment the counter by 1 Wake up some poor guy in the wait channel by calling wchan_wakeone Release the spin lock and return","tags":"os161","title":"OS161 Synchronization Primitives: Semaphore"},{"url":"http://jhshi.me/2013/03/15/vimium-not-working-in-google-search-results-page/index.html","text":"If you're a Vim user, then you must try Vimium . It makes your browsing much, much more comfortable! These days, I found that Vimium commands ( j , k , f ) don't work on Google's search results page, but work just fine in any other pages. I tried turning the instant search off, logging out of my account on Google's homepage, turning off personalized search results, etc. None of those worked. Then I found that Vimium only stops working if I use Chrome's Omnibox to search. That is, if I do the search on Google's home page instead of Chrome's Omnibox, then everything is fine. I suspect that some extra flags in the Omnibox's default search pattern are the reason why Vimium refuses to work. But the Omnibox is so convenient to use ( Alt+D to focus & search).
Opening Google's homepage every time you need to search will certainly be another pain. So I changed the default behavior of Chrome's Omnibox. Unfortunately, the built-in Google search pattern is unchangeable, so I added a new search engine entry and set it as the default. Here are the fields of the new entry: Name : Google ( or whatever you want ) Keyword : Google ( or whatever you want ) Search Pattern : http :// www . google . com / search ? q =% s Here is more detailed information about Google's search URL. Add whatever you need, but keep it minimal, in case you screw up Vimium again :-) https://moz.com/ugc/the-ultimate-guide-to-the-google-search-parameters","tags":"vim","title":"Vimium Not Working in Google Search Results Page"},{"url":"http://jhshi.me/2013/03/15/console-input-messed-up-in-os161/index.html","text":"When you have finished the process system calls (e.g., fork , execv ) and test them by executing some user program, you'll probably find that the console input behavior is messed up. For example, when you execute the user shell from the OS161 kernel menu, and then execute /bin/true from the shell, you may see this OS/161 kernel [ ? for menu ] : s Operation took 0.000285120 seconds OS/161 kernel [ ? for menu ] : ( program name unknown ) : Timing enabled. OS/161 $ /bin/true ( program name unknown ) : bntu: No such file or directory ( program name unknown ) : subprocess time : 0.063300440 seconds Exit 1 In this case, the shell program only received the input \"bntu\" instead of your input ( /bin/true ). To find out why, we need to dig a little bit into how the kernel menu ( $OS161_SRC/kern/startup/menu.c ) works. When you hit \"s\" in the kernel menu, what happens?
cmd_dispatch will look up the cmd_table and call cmd_shell cmd_shell just calls common_prog with the shell path argument common_prog will first create a child thread with the start function cmd_progthread , then return In the child thread, cmd_progthread will try to run the actual program (in our case, the shell) Note that the shell program is run in a separate child thread, and the parent thread (i.e., the menu thread) will continue to run after it \"forked\" the child thread. So now there are actually two threads that want to read console input, which leads to a race condition. This is why the shell program receives corrupted input: the menu thread has eaten some of the input! To solve this problem, we need to let the menu thread wait for the child thread to complete, then return. So what we need to do is, in common_prog , do a waitpid operation after we call thread_fork . And at the end of cmd_progthread , we need to explicitly call exit with a proper exit code in case the user program doesn't do this. Also note that waitpid and exit are in fact user-land system calls, and we can not directly call them in the kernel, so you may need to make some \"shortcuts\" in your system call implementation to let the kernel be able to call sys_waitpid and sys_exit .","tags":"os161","title":"Console Input Messed up in OS161"},{"url":"http://jhshi.me/2013/02/27/use-ant-exec-task-for-linux-shell-commands/index.html","text":"Suppose we use cscope and/or ctags for indexing the source code of our Java project, and we want to update the metadata files (e.g. cscope.out, tags) each time after we compile. We can use the -post-compile target to accomplish this. Create a custom_rules.xml in your project root directory with the following content. This file will be included into your main build.xml file. <?xml version=\"1.0\" encoding=\"UTF-8\"?> <project> <target name= \"-post-compile\" > <exec executable= \"find\" failonerror= \"true\" > <arg line= \" .
First, make sure you have cscope installed by issuing the following command: $ cscope --version If bash complains \"command not found\", then install cscope . In Ubuntu, the command is: $ sudo apt-get install cscope Then, we need to generate the cscope database. If you're dealing with C files, then in the root directory of the source tree, use this command: $ cscope -RUbq If you're dealing with Java files, before generating the database, we need to tell cscope which files to index: $ find . -name *.java > cscope.files $ cscope -RUbq The explanations are: -R: Recurse subdirectories during search for source files. -U: Check file time stamps. This option will update the time stamp on the database even if no files have changed. -b: Build the cross-reference only. We don't want the interactive mode. -q: Enable fast symbol lookup via an inverted index For more details, consult the cscope manual: $ man cscope After this step, several cscope database files will be generated. If you're using git or hg to manage your code, you may want to ignore them in the git/hg repository. Do that by adding this line into your .gitignore/.hgignore cscope.* Then we need to tell Vim how to interact with cscope .
Add the following lines into your .vimrc : if has(\"cscope\") set csprg=/usr/bin/cscope set csto=0 set cst set csverb \" C symbol nmap <C-\\>s :cs find s <C-R>=expand(\"<cword>\")<CR><CR> \" definition nmap <C-\\>g :cs find g <C-R>=expand(\"<cword>\")<CR><CR> \" functions called by this function nmap <C-\\>d :cs find d <C-R>=expand(\"<cword>\")<CR><CR> \" functions calling this function nmap <C-\\>c :cs find c <C-R>=expand(\"<cword>\")<CR><CR> \" text string nmap <C-\\>t :cs find t <C-R>=expand(\"<cword>\")<CR><CR> \" egrep pattern nmap <C-\\>e :cs find e <C-R>=expand(\"<cword>\")<CR><CR> \" file nmap <C-\\>f :cs find f <C-R>=expand(\"<cfile>\")<CR><CR> \" files # including this file nmap <C-\\>i :cs find i ^<C-R>=expand(\"<cfile>\")<CR>$<CR> \" Automatically make cscope connections function! LoadCscope() let db = findfile(\"cscope.out\", \".;\") if (!empty(db)) let path = strpart(db, 0, match(db, \"/cscope.out$\")) set nocscopeverbose \" suppress 'duplicate connection' error exe \"cs add \" . db . \" \" . path set cscopeverbose endif endfunction au BufEnter /* call LoadCscope() endif We're done! Now use Vim to edit a source code file. Put the cursor on a symbol (variable, function, etc.), first press Ctrl+\\ , then press: s : find all appearances of this symbol g : go to the definition of this symbol d : functions called by this function c : functions calling this function For more details about cscope inside Vim, press :h cs to see the help message of cscope .","tags":"vim","title":"Using Cscope INSIDE Vim"},{"url":"http://jhshi.me/2012/09/18/lfs-6-9-1-command-substitution-line-3-syntax-error-near-unexpected-token/index.html","text":"I encountered this error when compiling glibc. The apparent cause is that bash can not deal with brackets correctly.
So even a simple command like echo $(ls) will fail with the same error (command substitution). The most likely cause is that when compiling bash in section 5.15.1, I used byacc as a workaround when the compiler complained about the absence of yacc . Bash uses yacc grammar rules, and only GNU bison will generate the correct parsing code for the bash build . So I un-installed byacc and installed bison. Then: Make a soft link from /usr/bin/yacc to bison Recompile all the packages after 5.10 (tcl) and up to 5.15 (5.15 included) Test if the problem is solved using the echo $(ls) command If yes, then use /tools/bin/bash --login +h to launch the new bash Also see: http://www.mail-archive.com/lfs-support@linuxfromscratch.org/msg16549.html http://unix.stackexchange.com/questions/28369/linux-from-scratchs-bash-problem-syntax-error","tags":"errors","title":"LFS 6.9.1: command substitution: line 3: syntax error near unexpected token `)'"},{"url":"http://jhshi.me/2012/09/08/change-gccs-stack-protection-option-in-lfs/index.html","text":"In Chapter 5.5 , there is one step that fixes GCC's stack protection detection problem. The command is: sed -i '/k prot/agcc_cv_libc_provides_ssp=yes' gcc/configure This command seemed weird to me at first glance. After digging a little more into the sed command, its intention is much clearer. -i means change the file (i.e., gcc/configure ) in place /k prot/ is the pattern. If you look at gcc/configure , you'll find a comment line (around line 26695) that says: # Test for stack protector support in target C library And you'll see that this is the only occurrence of \"stack protector\" (as well as of k prot ). I think we'd better use /stack protector/ as the pattern for easy understanding. a means append a line after the line that contains the pattern. 
( sed document ) gcc_cv_libc_provides_ssp=yes is the actual line being appended.","tags":"linux","title":"LFS 5.5.1: Change GCC's Stack Protection Option"},{"url":"http://jhshi.me/2012/07/11/use-rsync-and-cron-to-do-regular-backup-part-ii/index.html","text":"Now we can take advantage of rsync to minimize the data to transfer when backing up. But it's still a little uncomfortable if we need to do this manually every day, right? Well, cron is here to solve the pain. Cron is a system service that automatically does some job as you specify. Backup, for example, is a perfect kind of job that we can count on cron for. First, we need to specify a job that we want cron to do. In my case, I want cron to automatically sync my source tree folder on the remote data center with my local backup folder. A simple rsync command seems to meet my need. But actually, there is more to consider: I don't want to copy the obj files, since they are normally large in size and change frequently, but can be easily re-generated. But I also don't want to skip the entire build folder when doing rsync since there are some configure files in there. The backup process should be totally automated. More specifically, no password is needed when doing rsync. For the first need, I can use ssh to send a remote command to do the necessary cleanup work before rsync. And the second need can be met according to my previous post about ssh/scp without password . So my final backup script looks like this: #!/bin/sh # ~/backup.sh LOG_FILE=~/backup.log SOURCE_DIR=b@B:~/src/ TARGET_DIR=~/src_backup date >> $LOG_FILE echo \"Synchronization start...\" >> $LOG_FILE ssh b@B 'cd ~/src/build; make clean; rm -rf obj/' >> $LOG_FILE rsync -avz --exclude \"tags\" $SOURCE_DIR $TARGET_DIR >> $LOG_FILE echo \"Synchronization done\" >> $LOG_FILE Once we figure out what to do, we need to tell cron about our job. The configuration file of cron is /etc/crontab . 
A job description looks like the following: # Example of job definition: # .---------------- minute (0 - 59) # | .------------- hour (0 - 23) # | | .---------- day of month (1 - 31) # | | | .------- month (1 - 12) OR jan,feb,mar,apr ... # | | | | .---- day of week (0 - 6) (Sunday=0 or 7) OR sun,mon,tue,wed,thu,fri,sat # | | | | | # * * * * * user-name command to be executed 0 0 * * * jack ~/backup.sh I want to do the backup every day at midnight, so I set the minute and hour both to 0. The asterisk ( * ) symbol in the day/month fields means any valid value. Now we are done. The backup process is completely automated and scheduled. Reference : http://myhowtosandprojects.blogspot.hk/2008/07/sincronize-folders-with-rsync-using-ssh.html","tags":"linux","title":"Use rsync and cron to do regular backup (Part II)"},{"url":"http://jhshi.me/2012/07/11/use-rsync-and-cron-to-do-regular-backup-part-i/index.html","text":"Recently I do most of my work on a remote data center through a slow network connection (<100KB/sec). I used to back up my project source tree as follows: I first do make clean and delete any unnecessary obj files to shrink the total file size, then I compress the whole source tree as a tar ball, and finally I use scp to fetch the backup tar ball to my local machine. The procedure is quite boring since I need to do this every day before I go home; otherwise the whole bandwidth will be occupied for nearly an hour, during which I can do almost nothing. The situation got better when I found rsync and cron . Here is how I do automatic regular (daily) backups with them. Rsync is a file synchronization tool that aims to minimize the data transferred when copying files. This is done by sending only the diffs to the destination. It is perfect when you need to do regular copies between two fixed locations. 
Rsync has many options (well, like most other GNU tools); here are a few of the more frequently used ones: # ensure that symbolic links, devices, attributes, permissions, # ownerships, etc. are preserved in the transfer -a, --archive # compress data during transfer, especially useful when the bandwidth is limited -z, --compress # exclude the directories or files that you don't want to sync, such as obj # files, tag files, etc. --exclude Suppose you have a source tree on host B: ~/src , and you want to sync this source tree with a local folder named ~/src_backup ; then the following command will suffice: $ rsync -avz --exclude \"obj/\" --exclude \"tags\" --exclude \"build\" b@B:~/src/ ~/src_backup The exclude options tell rsync to skip the obj and build subdirectories as well as the tags file. The trailing slash in the source ( b@B:~/src/ ) tells rsync not to create an additional directory level at the destination. Without this slash, rsync would create a src directory under ~/src_backup , which is not desirable. After the first rsync, the following rsync commands will only transfer the file changes, which saves a great deal of bandwidth.","tags":"linux","title":"Use rsync and cron to do regular backup (Part I)"},{"url":"http://jhshi.me/2012/07/11/dropbox-unable-to-monitor-filesystem/index.html","text":"Sometimes an error occurs that says: \"Unable to monitor file system. Please run: echo 100000 | sudo tee /proc/sys/fs/inotify/max_user_watches and restart Dropbox to correct the problem.\" We need to adjust the system setting for the maximum number of files that Dropbox can watch. 
The following command will solve your pain: $ echo fs.inotify.max_user_watches = 100000 | sudo tee -a /etc/sysctl.conf ; sudo sysctl -p Here is the tip from the Dropbox website .","tags":"errors","title":"Dropbox: Unable to monitor filesystem"},{"url":"http://jhshi.me/2012/07/11/ssh-error-agent-admitted-failure-to-sign-using-the-key/index.html","text":"If you followed my previous post about ssh/scp without password , but you get this error when you try to ssh from A to B, then you need to add RSA or DSA identities to the authentication agent. An ssh-add command on host A will solve your pain. $ ssh-add # Sample output Identity added: /home/jack/.ssh/id_rsa ( /home/jack/.ssh/id_rsa ) Reference http://www.cyberciti.biz/faq/unix-appleosx-linux-bsd-agent-admitted-failuretosignusingkey/","tags":"errors","title":"ssh error: Agent admitted failure to sign using the key"},{"url":"http://jhshi.me/2012/06/22/specify-graphicspath-in-latex/index.html","text":"We can use the graphicx package together with the graphicspath command to specify the lookup paths for pictures. A typical structure may look like this: \\usepackage{graphicx} % Must use this command BEFORE you begin document! \\graphicspath{{pic_path1/}{pic_path2}} \\begin{document} % some content \\end{document} As you can see, the syntax of the graphicspath command is very simple. You just enclose each picture path, either relative to the current working directory or absolute, in a pair of curly braces. Note that you must place this command before \\begin{document} , otherwise it will have no effect. 
Please refer to this page for more details about the graphicspath command.","tags":"latex","title":"Specify graphics path in Latex"},{"url":"http://jhshi.me/2012/05/07/use-trap-to-do-cleanup-work-when-script-terninates/index.html","text":"Now I have a script that monitors the output of several UART devices: #!/bin/bash for i in `seq 0 7`; do # use grep here to enforce line-buffered output, so concurrent # input from UART isn't messed up cat /dev/crbif0rb0c${i}ttyS0 | grep ^ --line-buffered & done wait But there is one problem: when you terminate the script ( ctrl+c ), these cat processes won't be killed, so the next time you run this script, you'll not be able to access these UART devices since they are busy. To solve this problem, we need to do some cleanup work when the script terminates. In this case, we need to kill these cat processes. We can use the trap command to do this. Basically, trap enables you to register a kind of handler for different kinds of signals . In this case, we can add a line into the script: trap \"pkill -P $$\" SIGINT for i in `seq 0 7`; do # use grep here to enforce line-buffered output, so concurrent # input from UART isn't messed up cat /dev/crbif0rb0c${i}ttyS0 | grep ^ --line-buffered & done wait $$ is the process id of the script. pkill -P $$ will kill all the child processes of $$ . So when the script terminates ( SIGINT signal from ctrl+c ), this pkill command will be executed and all the cat processes will be killed. Thanks to this post: http://www.davidpashley.com/articles/writing-robust-shell-scripts.html#id2564782","tags":"linux","title":"Use trap to Do Cleanup Work When Script Terminates"},{"url":"http://jhshi.me/2012/05/04/line-buffered-cat/index.html","text":"I'd like to watch the output of a UART device in Linux, and I only want to see the content when there is a whole line. 
So I prefer some kind of line-buffered cat such as: $ cat --line-buffered /dev/crbif0rb0c0ttyS0 But unfortunately, cat doesn't have a line-buffered option. Fortunately, GNU grep has such an option. So we can do: $ cat /dev/crbif0rb0c0ttyS0 | grep ^ --line-buffered Since every line has a ^ (line start), each line matches the grep . Note that I also tried: $ cat /dev/crbif0rb0c0ttyS0 | grep . --line-buffered But this does not work. Only empty lines are printed, and I don't know why...","tags":"linux","title":"Line Buffered Cat"},{"url":"http://jhshi.me/2012/05/02/os161-same_stack-check-fail-in-trap/index.html","text":"There are several SAME_STACK asserts in $OS161_SRC/kern/arch/mips/locore/trap.c to ensure that the current thread did not run out of kernel stack . A typical assert may look like: KASSERT(SAME_STACK(cpustacks[curcpu->c_number] - 1, (vaddr_t)tf)) The purpose of the SAME_STACK assertion In OS161, each thread has its own kernel stack. When interrupts or exceptions occur, the CPU will first switch to the current thread's kernel stack, both to avoid polluting the user's normal stack and to protect the stack from malicious user programs. The stack is allocated in thread_fork and in cpu_create (but not both). The initial stack size is defined in $OS161_SRC/kern/include/thread.h as STACK_SIZE . Since the stack grows downwards, to check if we run out of the stack, we put a few magic values at the bottom of the stack ( thread_checkstack_init ), so that we can check whether the values are the same as what we filled in ( thread_checkstack ) to see if we ran out of kernel stack. In $OS161_SRC/kern/arch/mips/locore/trap.c , there are a few SAME_STACK assertions to make sure the trap frame is at the right place. Why would we run out of kernel stack? Remember that any variables you define in your syscall functions are allocated on the current thread's kernel stack. So if you allocate large variables, such as a big array buffer, you'll probably have a stack \"downflow\". 
So, either try to shrink your declared buffer size, or use kmalloc instead. Or, you can enlarge the stack size to temporarily solve your pain, but this is not recommended: each thread has its own stack, so if it's too large, you'll soon run out of physical memory when you have lots of threads. Problem of the macro During the lab, I sometimes failed this assert. At first, I thought I'd run out of kernel stack, so I enlarged STACK_SIZE to 16 KB. But I still failed this assert after that. Then I took a look at the definition of the SAME_STACK macro: #define SAME_STACK(p1, p2) (((p1) & STACK_MASK) == ((p2) & STACK_MASK)) I found this macro problematic. Suppose STACK_SIZE = 0x00004000 , then STACK_MASK = ~(STACK_SIZE-1) = 0xFFFFC000 . Assume p1 (stack top) = 0x80070FFF , p2 (stack pointer) = 0x8006FFFF ; then we've only used 0x00001000 bytes of stack, but the SAME_STACK macro will fail, since p1 & STACK_MASK = 0x80070000 while p2 & STACK_MASK = 0x8006C000. The point here is that the stack top address may not be STACK_SIZE -aligned, so we can not do the same-stack check by simply comparing the masked base addresses. So we need to modify this part to get our kernel to work. This is not your fault but probably a bug shipped with the kernel. You can use any tricky macros here, but a simple pair of comparisons will suffice: KASSERT(((vaddr_t)tf) >= ((vaddr_t)curthread->t_stack)); KASSERT(((vaddr_t)tf) < ((vaddr_t)curthread->t_stack + STACK_SIZE));","tags":"os161","title":"OS161 SAME_STACK Check Fail in Trap"},{"url":"http://jhshi.me/2012/05/02/os161-duplicated-tlb-entries/index.html","text":"Sys161 will panic if you try to write a TLB entry with an entryhi when there is already a TLB entry with the same entryhi in a different TLB slot. This is because entryhi should be a UNIQUE key in the TLB bank. 
When you want to update a TLB entry (e.g., shoot down a TLB entry, set the Dirty bit, etc.), you need to first use tlb_probe to query the TLB bank to get the TLB slot index, then use tlb_read to read the original value, and then use tlb_write to write the updated TLB entry value to this slot. But what if there is an interrupt after your tlb_probe but before tlb_read ? Chances are that the TLB bank is totally refreshed, so that you read a stale value and also write a stale value. Things get totally messed up and errors such as \"Duplicated TLB entries\" may occur. To resolve this, you need to protect your whole \" tlb_probe -> tlb_read -> tlb_write \" flow and make sure that this flow won't get interrupted. So you really want to disable interrupts ( int x = splhigh() ) before you do tlb_probe and re-enable them ( splx(x) ) after tlb_write . Alternatively, you can also use a spin lock to protect your access to the TLB.","tags":"os161","title":"OS161 Duplicated TLB entries"},{"url":"http://jhshi.me/2012/04/28/os161-swapping/index.html","text":"Now that you can allocate/free physical pages , and you have demand paging through handling TLB misses , let's get the final part working: swapping. UPDATE (2016-04-26) You should only use the disk to store the swapped pages. Three basic operations for a physical page The first is called evict . In a nutshell, evicting a physical page means we modify the page table entry so that this page is not Present ( PTE_P ), but Swapped ( PTE_S ). And we also need to shoot down the relevant TLB entry. But in evict , we will not write the page's content to disk. Apparently, evict can only operate on clean pages . The second operation is swapout . We first write this page's content to disk, which turns the page from dirty to clean, and then we just evict it. The swapout operation is for dirty pages . The last operation is swapin . 
Basically, it reads some virtual page from the swap disk and places it in some physical page; we also need to modify the relevant page table entry, making this page Present ( PTE_P ) instead of Swapped ( PTE_S ). How to store the swapped pages By default, sys161 provides two disks through lamebus, i.e., lhd0 and lhd1 . If you want to store the pages in the raw disk, you should open the swap space. Note that the file name must be lhd0raw: or lhd1raw: and the open flag must be O_RDWR , since the disk is already there and doesn't need to be created or truncated. Update : Actually, I didn't realize that we can change the RPM of the disks to make swapping faster than writing to emufs . So my suggestion would be: use the disk to store swapped pages and set the RPM to a large enough value in sys161.conf (e.g., 28800). For the same reason why we can not open consoles in thread_create , you can not do this in vm_bootstrap since at that point the VFS system was not initialized yet (see $OS161_SRC/kern/startup/main.c for the boot sequence, especially line 125~130). But it's OK, we can delay opening the file until we really need to write pages , e.g., when we swap out the first page. We'll leverage file operations to manipulate swapped pages. You may want to review the file operation system calls to get familiar with VFS operations. We use uio_kinit and VOP_READ / VOP_WRITE a lot here. But before all these, we need to first create a swap file. We also need some data structure to record each page's location in the swap file. This data structure should be something like a map. The key is the (address space, va) pair, and the value is the location of the page. As usual, for simplicity, we can just use a statically allocated array. Each array element contains the (address space, va) pair, and this element's index is the page's location . Of course, we need to set up a limit on the maximum number of swapped pages if we adopt this silly manner. 
When swapping out a page, we first look up this array (by comparing as and va ) and find out whether the swap file already contains a copy of this page; if yes, we directly overwrite that page, and if no, we just find an available slot and write the page to that slot. An important note is that you want to create ONE swap file for all processes , instead of one swap file for each process. By doing the latter, you also have to allocate a mapping structure for each process, and you'll run out of memory very quickly (kernel pages are fixed, right?). Now the swap file and the mapping data structure are shared resources among all processes. So you need to protect them with a lock. Two I/O operations on the swap disk These two operations are quite straightforward. The first is called write_page , which is responsible for writing a page's content to a specified location in the swap file . The second is read_page , which reads a specified page from the swap file and copies the content to a physical page . We do not necessarily have to have these two utility functions, but it's always good to abstract low-level operations and encapsulate them in a convenient interface. The Swapping Workflow In your paging algorithm, you will certainly first look for free physical pages. But once you fail to find such a page, you have to swap some page out to get a free page. That's what the magic function MAKE_PAGE_AVAIL does in my previous post about physical page allocation . Now let's take a look at the magic function. Denote the page to be swapped out as the victim. If its state is PAGE_STATE_CLEAN , it means that this page already has a copy on disk and has not been modified since it was swapped in. So we can safely discard its content. We use the evict operation to deal with it. And after that, this page is available. 
If this page is dirty, which means either this page does not have a copy in the swap file or it was modified since it was swapped in, in both cases we need to write its content to the swap file. We can use the swapout operation here. In vm_fault with fault type VM_FAULT_READ or VM_FAULT_WRITE , when we find that this page is not Present ( PTE_P ), instead of allocating a new page for it, we need to further check if this page was swapped ( PTE_S ); if yes, we need to swap it in, and if no, we can allocate a new physical page for it.","tags":"os161","title":"OS161 Swapping"},{"url":"http://jhshi.me/2012/04/27/os161-tlb-miss-and-page-fault/index.html","text":"Now that we've set up the user address space, it's time to handle TLB/page faults. Note that there is a difference between TLB and page faults: A TLB fault means the hardware doesn't know how to translate a virtual address since the translation isn't present in any TLB entry. So the hardware raises a TLB fault to let the kernel decide how to translate that address. A page fault means the user program tries to access a page that is not in memory, either not yet allocated or swapped out. TLB Entry Format In sys161, which simulates the MIPS R3000, there are 64 TLB entries in total. Each entry is a 64-bit value that has the following format: Section 18.6 of this document contains a detailed description of the meaning of each bit. But briefly, VPN (abbr. for Virtual Page Frame Number) is the high 20 bits of a virtual address and PPN is the high 20 bits of a physical address. When the Dirty bit is 1, it means this page is writable; otherwise, it's read-only. When the Valid bit is 1, it means this TLB entry contains a valid translation. In OS161, we can just ignore the ASID part and the Global bit, unless you really want to do some tricks such as multiplexing the TLB among processes instead of just shooting down all TLB entries on context switch. Also, we can ignore the NoCache bit. 
TLB Miss Type When translating a virtual address, the hardware will issue a parallel search in all the TLB entries, using the VPN as the search key. If the hardware fails to find an entry, or finds an entry whose Valid bit is 0, a TLB Miss will be issued. The miss type could be VM_FAULT_READ or VM_FAULT_WRITE , depending on whether it's a read or write operation. On the other hand, if it's a write operation and the hardware finds a valid TLB entry for the VPN, but the Dirty bit is 0, then this is also a TLB miss, with type VM_FAULT_READONLY . If none of the above cases happens, then this is a TLB hit, and everybody is happy :-) TLB Manipulate Utils Before we discuss how to handle a TLB fault, we first take a look at how to manipulate the TLB entries. The functions that access the TLB can be found in $OS161_SRC/kern/arch/mips/include/tlb.h . Four routines are provided, and the comments there are quite clear. We use tlb_probe to query the TLB bank, tlb_read / tlb_write to read/write a specific TLB entry, and tlb_random to let the hardware decide which entry to write to. Finally, handle TLB Miss On a TLB fault, the first thing to do is to check whether the faulting address is a valid user space address, since it's possible that the fault is caused by copyin / copyout , which expect a TLB fault. So what's a \"valid\" user space address? User code or data segment User heap, between heap_start and heap_end User stack If the address is invalid, then we directly return some non-zero error code, to let the badfault_func capture the fault. For VM_FAULT_READ or VM_FAULT_WRITE , we just walk the current address space's page table and see if that page actually exists (by checking the PTE_P bit). If not, we just allocate a new page and modify the page table entry to insert the mapping (since we haven't turned on swapping yet, not present means this is the first time we access this page ). 
The permissions of the newly allocated page should be set according to the region information we stored in struct addrspace . Finally we just use tlb_random to insert this mapping into the TLB. Of course, you can adopt some TLB algorithm here that chooses a specific TLB victim. But only do this when you have all of your VM system working. For VM_FAULT_READONLY , this page is already in memory and the mapping is already in the TLB bank ; it's just that the Dirty bit is 0 and the user tries to write this page. So we first check whether the user can really write this page , maybe via the access bits in the low 12 bits of the page table entry. (Recall that in as_define_region , the user passed in some attributes like readable, writable and executable. You should record them there and use them for the check here.) If the user wants to write a page that he has no right to write, then this is an access violation. You can just panic here or, more gracefully, kill the current process. But if the user can actually write this page, then we first query the TLB bank to get the index of the TLB entry, set the Dirty bit of entrylo , and write it back using tlb_write . Don't forget to change the physical page's state to PAGE_STATE_DIRTY (it's useless now but will be useful for swapping). The above is pretty much what vm_fault does. Three extra tips: Since the TLB is also a shared resource, you'd better use a lock to protect the access to it . And it had better be a spinlock, since we sometimes perform TLB operations in an interrupt handler, where we don't want to sleep. Do not print anything inside vm_fault . kprintf may touch some of the TLB entries, so the TLB may have been changed between the miss and vm_fault , which can lead to some really weird bugs. Assumption is the source of all evil. 
Use a lot of KASSERT s to make your assumptions explicit and check that they hold.","tags":"os161","title":"OS161 TLB Miss and Page Fault"},{"url":"http://jhshi.me/2012/04/27/os161-sbrk-system-call/index.html","text":"If you're not familiar with the sbrk system call, here is its wiki , and its interface description . In a nutshell, malloc will use sbrk to get heap space. In as_define_region , we've found the highest address that the user text and data segments occupy, and based on this, we've set the heap_start in struct addrspace . This makes the sbrk system call implementation quite easy: almost just parameter checking work. Several points: inc could be negative, so make sure heap_end+inc >= heap_start Better to round up inc by 4. This is optional but can lower the chance of unaligned pointers After all this checking, just return heap_end as a void* pointer and increase heap_end by inc . Of course, like any other system call, you need to add a case entry in the syscall function.","tags":"os161","title":"OS161 sbrk System Call"},{"url":"http://jhshi.me/2012/04/27/sshscp-without-password/index.html","text":"Suppose you have two machines: A and B. A is your work machine; you do most of your work on it. But B is a little special (e.g., connected to some specific hardware), so you need to ssh to it or copy some files from A to B from time to time. Here is the way you can get rid of entering passwords every time you do ssh/scp. First, on machine A, generate an RSA key pair: $ ssh-keygen -t rsa Generating public/private rsa key pair. Enter file in which to save the key ( YOUR_HOME/.ssh/id_rsa ) : # press ENTER here to accept the default filename Enter passphrase ( empty for no passphrase ) : # press ENTER here to use no passphrase, otherwise, you still need # to enter this passphrase when you ssh Enter same passphrase again: # press ENTER here Your identification has been saved in $HOME /.ssh/id_rsa. Your public key has been saved in $HOME /.ssh/id_rsa.pub. 
The key fingerprint is: ..... ( omitted ) Then, change the access mode of the .ssh directory: $ chmod 775 ~/.ssh Then append the content of the just-generated id_rsa.pub to the $HOME/.ssh/authorized_keys file on machine B: # copy the id_rsa.pub file to host B $ scp ~/.ssh/id_rsa.pub b@B:. # login to B $ ssh b@B # append the content to authorized_keys $ cat id_rsa.pub >> .ssh/authorized_keys Finally, ssh to B and change the access mode of the file authorized_keys . This is optional; you may not need to do this if you can already ssh without entering a password. $ ssh b@B $ chmod 700 .ssh $ chmod 640 ~/.ssh/authorized_keys Depending on your version of ssh, you may also need to do the following: $ ssh b@B $ cp ~/.ssh/authorized_keys ~/.ssh/authorized_keys2 That's it! Enjoy! Reference http://www.cyberciti.biz/faq/ssh-password-less-login-with-dsa-publickey-authentication/ http://www.linuxproblem.org/art_9.html","tags":"linux","title":"ssh/scp without password"},{"url":"http://jhshi.me/2012/04/24/os161-user-address-space/index.html","text":"Now we've set up our coremap and also have the routines to allocate and free physical pages. It's time to set up the user's virtual address space. Basically, we'll adopt a two-level page table . If you're not already familiar with this, you can check out the page table wiki and this document talking about MIPS and X86 paging . The page table entry format will be much like that in X86. For a page directory entry, the upper 20 bits indicate the base physical address of the page table, and we use one bit in the lower 12 bits to indicate whether this page table exists or not . 
You are free to define all these (the format of page directory and page table entries) though, since the addressing process is totally done by software in MIPS, but following the conventions is still better for compatibility as well as easy programming. What to store in the addrspace structure? An address space is actually just a page directory : we can use this directory and its page tables to translate all the addresses inside the address space. And we also need to keep some other information like the user heap start, user heap end, etc. But that's all, and no more. So in as_create , we just allocate an addrspace structure using kmalloc , allocate a physical page (using page_alloc ) as the page directory, and store its address (either KVADDR or PADDR is OK, but you can just choose one). Besides, we need to record somewhere in the addrspace structure the valid regions the user defined using as_define_region , since we're going to need that information during page fault handling to check whether the faulted address is valid or not. Address Translating with pgdir_walk This is one of the most important core functions in this lab. Basically, given an address space and a virtual address, we want to find the corresponding physical address. This is what pgdir_walk does. We first extract the page directory index (top 10 bits) from the va and use it to index the page directory; thus we get the base physical address of the page table. Then we extract the page table index (middle 10 bits) from the va and use it to index the page table; thus we get the base physical address of the actual page. Several points to note: Instead of returning the physical address, you may want to return the page table entry pointer instead, since in most cases we use pgdir_walk to get page table entries and modify them. We'll also need to pass pgdir_walk a flag, indicating whether we want to create the page table if it doesn't exist (remember the present bit of the page directory entry?). 
Sometimes we want to make sure that a va is mapped to a physical page when calling pgdir_walk , but most of the time, we just want to query whether a va is mapped. Think clearly about which addresses are physical and which are virtual. Page directory entries and page table entries should store physical address bases. You'll need a lot of PADDR_TO_KVADDR here. Copy address space using as_copy This part is easy if you decide not to support Copy-On-Write pages. Basically, you just pgdir_walk the old address space's page table, and copy all the present pages. Only one point: don't forget to copy all the attribute bits (low 12 bits) of the old page table entry . You'll get some extra work when you enable swapping: you need to copy all the swapped pages besides the present pages as well. Destroy address space with as_destroy As easy as as_copy : just pgdir_walk the page table and free all the present pages. Also, as with as_copy , you will need to free the swapped pages later. Define regions using as_define_region Since we'll do on-demand paging , we won't allocate any pages in as_define_region . Instead, we just walk through the page table, and set the attribute bits accordingly. One point: remember the heap_start and heap_end fields in struct addrspace ? Question: where should the user heap start? Immediately after the user bss segment! And how would we know the end of the user bss segment? In as_define_region ! So each time in as_define_region , we just compare the addrspace's current heap_start and the region end, and set heap_start right after the region end ( vaddr+sz ). Don't forget to properly align heap_start (by page boundary), of course. This should also be the place we record each region's information (e.g., base, size, permissions, etc.) so that we can check them in vm_fault . Miscellaneous In as_activate , if you don't use the ASID field of the TLB entry, then you can just shoot down all the TLB entries. It's the easiest way to go. 
In as_prepare_load , we need to change each region's page table permissions to read-write, since we're going to load content (code, data) into them. And in as_complete_load , we need to change their page table permissions back to their original values. In as_define_stack , we just return USERSTACKTOP .","tags":"os161","title":"OS161 User Address Space"},{"url":"http://jhshi.me/2012/04/24/os161-physical-page-management/index.html","text":"We'll talk about page_alloc , page_free , alloc_kpages and free_kpages . Allocate one single physical page with page_alloc This is relatively easy once you've decided which paging algorithm to use. FIFO seems good enough in terms of simplicity as well as acceptable performance. We just scan the coremap and find a FREE page if there is one, or else find the oldest page. At this stage (before swapping), I will use a magic function called MAKE_PAGE_AVAIL , which obviously makes a page available; whether by flushing or swapping, we don't care :-). Once we find a victim (maybe free, clean, or dirty, but it must not be fixed ), we call MAKE_PAGE_AVAIL on it, and update its internal fields like timestamp, as , va , etc. And don't forget to zero the page before we return. A trade-off here is: what parameters should we pass to page_alloc ? One choice is nothing: I just tell you to give me a page, and I'll deal with the page meta-info by myself. But this manner will probably cause page-info inconsistency, e.g., the caller forgets to set the page's state. So to avoid this, I prefer that the caller tells page_alloc everything it needs, like as , va , whether the allocated page needs to stay in memory, etc., and lets page_alloc set the page's meta info accordingly. BTW, since the coremap is a globally shared data structure, you really want to use a lock to protect it every time you read/write it. Allocate n contiguous pages with page_nalloc Since kernel addresses bypass the TLB and are directly mapped.
(See this and this for details.) When we're asked to allocate n (where n > 1) pages by alloc_kpages , we must allocate n contiguous pages ! To do this, we need to first find a chunk of n available (i.e., not fixed) contiguous pages, and then call MAKE_PAGE_AVAIL on these pages. Like page_alloc , we also need to update the coremap and zero the allocated memory. As mentioned in my previous blog about the coremap , in alloc_kpages we need to first check whether the vm has bootstrapped : if not, we just use get_ppages ; otherwise, we use our powerful page_nalloc . Also, we need to record how many pages we allocated so that when free_kpages is called, we can free all these npages pages. Free a page with page_free and free_kpages We just need to mark the page as FREE. But if this page was mapped into a user address space ( page->as != NULL ), then we need to first unmap it, and shoot down the TLB entry if needed. We'll talk about user address space management later. Only one tip for this part: do not forget to protect every access to the coremap using a lock (but not a spinlock).","tags":"os161","title":"OS161 Physical Page Management"},{"url":"http://jhshi.me/2012/04/24/os161-coremap/index.html","text":"The first concern of the OS161 virtual memory system is how to manage physical pages. Generally, we can pack a physical page's information into a structure (called struct coremap_entry ) and use this struct to represent a physical page . We use an array of struct coremap_entry to keep all physical pages' information. This array, aka the coremap , will be one of the most important data structures in this lab. What should we store in the coremap entry structure? For each physical page, we want to know: Where is this page mapped? (For swapping) What's this page's status? (free, fixed, clean, dirty...) Other info (e.g., needed by the paging algorithm) A page can have four different states, as shown below. This diagram is quite clear.
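Concretely, a coremap entry carrying the bookkeeping above might be declared like this (a sketch only: OS161 leaves the layout entirely up to you, and the field and enum names here are mine):

```c
#include <stddef.h>
#include <stdint.h>

struct addrspace;                 /* opaque here; the real one lives in addrspace.h */

/* The four page states. */
typedef enum { PG_FREE, PG_CLEAN, PG_DIRTY, PG_FIXED } page_state_t;

struct coremap_entry {
    struct addrspace *as;         /* which address space this page maps, or NULL */
    uint32_t          va;         /* virtual address it is mapped at */
    page_state_t      state;
    uint32_t          timestamp;  /* allocation time, e.g. for FIFO eviction */
};

/* Fixed pages are never swapped out, so they can never be eviction victims. */
static int is_evictable(const struct coremap_entry *e)
{
    return e->state != PG_FIXED;
}
```

A helper like is_evictable is where the "never swap out fixed pages" rule gets enforced when picking a victim.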
Several points to note: When a physical page is first allocated, its state is DIRTY, not CLEAN, since the page does not yet have a copy in the swap file (disk). Remember that in a virtual memory system, memory is just a cache of the disk. For various reasons, we may want to always keep certain pages in memory: e.g., kernel pages, since these pages are directly mapped, and user stack and code segment pages, which we already know will be frequently accessed. So we have a special state called \"fixed\", meaning that we'll never swap these pages out to disk . Coremap Initialization We need to initialize our coremap in vm_bootstrap . First, we need to find out how many physical pages are in the system. We can do this using ram_getsize . There is a big trick here. Since we will only know the physical page count, i.e., the coremap array length, at runtime, we'd better just define a struct coremap_entry pointer and allocate the actual array at runtime after we get the physical page count, rather than use a statically defined array with some macro like MAX_PHY_PAGE_NUM . So at first glance, we may just call ram_getsize and then kmalloc the coremap array. But that will definitely fail . Take a look at ram_getsize : this function destroys its firstaddr and lastaddr before returning. So after that, if we call kmalloc , which calls alloc_kpages , get_ppages and ram_stealmem to get memory, ram_stealmem will fail. The contradiction is: we need to call ram_getsize to get the physical page count so that we can allocate our coremap ( pages ), but once we call ram_getsize we will not be able to allocate any pages! To resolve this contradiction, we should first initialize all other data structures, e.g., locks, before we call ram_getsize . Then we call ram_getsize to get firstaddr and lastaddr . After that, instead of using kmalloc , we must allocate our coremap manually , without invoking any other malloc routines.
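Under these constraints, the arithmetic for carving the coremap out of RAM by hand might look like the following sketch (the struct coremap_entry stub, ROUNDUP and the function name are illustrative only; the real code would convert firstaddr with PADDR_TO_KVADDR and write the array there):

```c
#include <stdint.h>

struct coremap_entry { uint32_t state; };   /* stub for illustration */

#define PAGE_SIZE 4096u
#define ROUNDUP(a, n) ((((a) + (n) - 1) / (n)) * (n))

/* Given the RAM range reported by ram_getsize, decide where the coremap
 * array itself lives and where truly free memory begins. */
static uint32_t coremap_place(uint32_t firstaddr, uint32_t lastaddr,
                              uint32_t *npages_out)
{
    /* one entry per physical page in [0, lastaddr) */
    uint32_t npages = lastaddr / PAGE_SIZE;
    uint32_t map_bytes =
        ROUNDUP(npages * (uint32_t)sizeof(struct coremap_entry), PAGE_SIZE);

    *npages_out = npages;
    /* the array occupies [firstaddr, firstaddr + map_bytes) ... */
    return firstaddr + map_bytes;   /* ... and this is freeaddr */
}
```

Everything below the returned freeaddr (kernel, coremap itself) gets marked fixed; everything above it is free.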
A possible solution is to place the coremap array itself at firstaddr . Now we have allocated our coremap between firstaddr and freeaddr , and [ freeaddr , lastaddr ] will be the system's free memory. Then we initialize the coremap array: we need to mark any pages in [0, freeaddr ) as fixed, since this memory contains important kernel code and data, or memory-mapped I/O. And we just mark pages in [ freeaddr , lastaddr ] as free. At the end of vm_bootstrap , we may want to set a flag to indicate that the vm has bootstrapped, since functions like alloc_kpages may call different routines to get physical pages before and after vm_bootstrap .","tags":"os161","title":"OS161 Coremap"},{"url":"http://jhshi.me/2012/04/19/os161-virtual-memory-resources/index.html","text":"Here are various documents that I found helpful for implementing the OS161 virtual memory system. These are two other blogs that also talk about the VM of OS161: http://asmarkhalid.blogspot.com/ http://flounderingz.blogspot.com/ A very good document introducing the MIPS TLB: http://pages.cs.wisc.edu/~remzi/OSFEP/vm-tlbs.pdf Lecture notes about the MIPS TLB and paging: http://frankdrews.com/public_filetree/cs458_558_SQ10/Slides/mm.pdf http://people.csail.mit.edu/rinard/osnotes/h11.html http://people.csail.mit.edu/rinard/osnotes/h10.html A lecture note about the MIPS stack and heap, helpful when implementing the sbrk system call: http://www.howardhuang.us/teaching/cs232/04-Functions-in-MIPS.pdf","tags":"os161","title":"OS161 Virtual Memory Resources"},{"url":"http://jhshi.me/2012/03/28/os161-arguments-passing-in-system-call/index.html","text":"One principle of kernel programming is: do not trust anything users pass in . Since we assume that users are bad, they will do anything they can to crash the kernel (just as $OS161_SRC/user/testbin/badcall/badcall.c does). So we need to pay special attention to the arguments of system calls, especially the pointers .
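As a first line of defense on OS161/MIPS, a pointer can be rejected outright if it is NULL or falls in the kernel half of the address space; this sketch assumes the conventional 0x80000000 user/kernel boundary (the constant name is mine). Note that passing this check alone is not sufficient, since the address may still be unmapped:

```c
#include <stdint.h>

#define USER_TOP 0x80000000u   /* assumed user/kernel split on MIPS */

/* Cheap sanity check: NULL and kernel addresses can never be valid
 * user pointers.  A passing result still doesn't mean the address is
 * mapped -- only the copyin/copyout machinery can establish that safely. */
static int plausible_user_ptr(uint32_t uaddr)
{
    return uaddr != 0 && uaddr < USER_TOP;
}
```

This is exactly the kind of address that must still go through copyin / copyout , which can recover gracefully if the page turns out not to exist.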
$OS161_SRC/kern/vm/copyinout.c provides several useful facilities to safely copy user-level arguments into the kernel, or vice versa. They ensure that even if a user argument is illegal, the kernel can still get control and handle the error, instead of just crashing . So let's see how they can be applied in the system calls. User space strings Some system calls, e.g. open , chdir , execv , take a user-level string as an argument. We can use copyinstr to handle this. See the prototype of copyinstr : int copyinstr ( const_userptr_t usersrc , char * dest , size_t len , size_t * actual ) const_userptr_t is just a signpost that makes usersrc explicitly look like a user pointer. So basically, this function copies a \\0 -terminated user space string into the kernel buffer dest , copying at most len bytes, and returns the actual number of bytes copied in actual . Note that copyinstr will also copy the last \\0 . Suppose we have a function that takes a user space string as an argument. int foo ( char * name ) { char kbuf [ BUF_SIZE ]; int err ; size_t actual ; if (( err = copyinstr (( const_userptr_t ) name , kbuf , BUF_SIZE , & actual )) != 0 ) { return err ; } return 0 ; } Then if we call foo(\"hello\") , on success, actual will be 6, including the last \\0 .
For example: int foo_read ( unsigned char * ubuf , size_t len ) { int err ; void * kbuf = kmalloc ( len ); if ( kbuf == NULL ) { return ENOMEM ; } if (( err = copyin (( const_userptr_t ) ubuf , kbuf , len )) != 0 ) { kfree ( kbuf ); return err ; } if (( err = copyout ( kbuf , ( userptr_t ) ubuf , len )) != 0 ) { kfree ( kbuf ); return err ; } kfree ( kbuf ); /* don't leak the buffer on the success path */ return 0 ; }","tags":"os161","title":"OS161: Arguments Passing in System Call"},{"url":"http://jhshi.me/2012/03/21/os161-general-tips-for-system-call/index.html","text":"Here are some practices that will hopefully make you feel more comfortable and more productive when poking around with os161 syscalls. Tired of bmake & bmake install every time? Edit $OS161_SRC/mk/os161.kernel.mk , find this line: all : includelinks . WAIT $ ( KERNEL ) Add some lines below it: all : includelinks . WAIT $ ( KERNEL ) # generate tags with ctags , excluding some directories cd $ ( TOP ); ctags - R -- exclude = '.git' -- exclude = 'build' -- exclude = 'kern/compile' .; cd - # automatically execute bmake install after bmake bmake install Then a single bmake will automatically generate tags for your source files as well as install the executable. Work on file system calls first Work on the file system calls and make them work correctly first, since user-level I/O functions (most importantly printf ) rely heavily on sys_write and sys_read of the console. If you work on the process system calls first, how would you verify your code is right? Without a working and correct printf , most of the test programs won't work. Test your code The test programs in $OS161_SRC/user/testbin are very helpful when you want to test your code, especially badcall (asst2), filetest , crash (for kill_curthread ), argtest (for execv ) and forktest . You can use the p command provided by the os161 kernel menu to execute these test programs: OS/161 kernel [? for menu] : p / testbin / argtest abc def ghi jkl mno p Use GDB Without GDB, you're dead. It's really worth spending some time to learn the basic usage of gdb.
An upsetting fact is that you cannot watch user-level code (or you don't want to bother), so use the \"printf debugging method\" in user code. Here are a few excellent gdb tutorials that you'll probably find helpful: GDB Tutorial from CMU Tips from Harvard","tags":"os161","title":"OS161: General Tips for System Call"},{"url":"http://jhshi.me/2012/03/21/os161-how-to-add-a-system-call/index.html","text":"Let's use the fork system call as an example. For convenience, let's assume $OS161_SRC is your os161 source root directory. How is a system call defined? Take a look at $OS161_SRC/user/lib/libc/arch/mips/syscalls-mips.S . We can see that a macro called SYSCALL(sym, num) is defined. Basically, this macro does a very simple thing: it fills $v0 with SYS_##sym and jumps to the common code at __syscall . Two points to note here: SYS_##sym is a little compiler trick. ##sym will be replaced by the actual name of sym . In our case ( SYSCALL(fork, SYS_fork) ), sym is actually fork , so SYS_##sym will be replaced by SYS_fork . See this gcc document if you want to know more details about it. The second argument of the macro, num , is unused here. Then in __syscall , the first instruction is the MIPS syscall instruction . We'll discuss the details of this instruction later. After it, we check the $a3 value to see if the syscall succeeded, and store the error number ( $v0 ) into errno if not. $OS161_SRC/build/user/lib/libc/syscall.S is generated from $OS161_SRC/user/lib/libc/arch/mips/syscall-mips.S during compilation, and this is the actual file that gets compiled and linked into the user library. We can see that besides the SYSCALL macro and the __syscall code, declarations of all the syscalls are added here. So when we call fork in a user program, we actually call the assembly function defined in this file. How does a system call get called? The MIPS syscall instruction causes a software interrupt. (See MIPS syscall function ).
After this instruction, the hardware automatically turns off interrupts, then jumps to the code located at 0x80000080 . From $OS161_SRC/kern/arch/mips/locore/exception-mips1.S , we can see that mips_general_handler is the code defined at 0x80000080 . The assembly code here does a lot of stuff that we don't need to care about. All we need to know is that it saves a trapframe on the current thread's kernel stack and calls mips_trap in $OS161_SRC/kern/arch/mips/locore/trap.c . Then, if this trap (or interrupt) was caused by the syscall instruction, mips_trap will call syscall in $OS161_SRC/kern/arch/mips/syscall/syscall.c to handle it. Then we reach our familiar syscall function, where we dispatch the syscall according to the call number, collect the results and return. If everything is OK, we go back to mips_trap , then to the assembly code at common_exception , and then back to user mode. How to add a system call To add a system call, a typical flow would be: Add a case branch in the syscall function: case SYS_fork : err = sys_fork ( & retval , tf ); break ; Add a new header file in $OS161_SRC/kern/include/kern , declaring your sys_fork Include your header file in $OS161_SRC/kern/include/syscall.h so that the compiler can find the declaration of sys_fork Add a new C file in $OS161_SRC/kern/syscall , implementing your sys_fork function Add your C file's full path to $OS161_SRC/kern/conf/conf.kern so that it will be compiled. See the loadelf.c and runprogram.c entries in that file for examples. Then in $OS161_SRC/kern/conf , reconfigure the kernel : $ ./config ASST3","tags":"os161","title":"OS161: How to Add a System Call"},{"url":"http://jhshi.me/2012/03/18/os161-process-scheduling/index.html","text":"OS161 provides a simple round-robin scheduler by default.
It works like this: hardclock from $OS161_SRC/kern/thread/clock.c is called periodically (from the hardware clock interrupt handler) Two functions may be called thereafter: schedule to change the order of the threads in the ready queue, which currently does nothing thread_consider_migration to enable thread migration among CPU cores Then it calls thread_yield to cause the current thread to yield to another thread We need to play with the schedule function to give interactive threads higher priority. Why give priority to interactive threads? There are two reasons for this (at least the two in my mind): Your time is more valuable than the computer's . So in general, we should first serve those threads that interact with you. For example, you don't want to wait for the computer in a shell while it's busy doing a backup, right? Interactive threads tend to be I/O bound, which means they often get stuck waiting for input or output. So they normally fail to consume their granted time slice. Thus we can switch to computation-bound threads when they get stuck, and boost computer utilization. How can we know whether a thread is interactive or not? As said above, interactive threads are normally I/O bound, so they often need to sleep a lot. In $OS161_SRC/kern/thread/thread.c , we can see that thread_switch is used to actually switch between threads. The first argument is newstate , which gives some hints about the current thread. If newstate is S_READY , it means that the current thread has consumed all its time slice and is forced to yield to another thread (by the hardware clock). So we can guess that it's not interactive, or rather, that it's computation intensive. However, if newstate is S_SLEEP , then it means the current thread offered to yield to another thread , maybe waiting for I/O or a mutex. Thus we can guess that this thread is more interactive, or rather, I/O intensive. So from the newstate , we can make a good guess about the current thread. How to implement it?
Multi-Level Feedback Queue seems to be a good enough algorithm in this case. We can add a priority field to struct thread and initialize it to medium priority in thread_create . Then in thread_switch , we can adjust the current thread's priority according to newstate . If it's S_SLEEP , we increase the current thread's priority. Otherwise, if it's S_READY , we decrease the current thread's priority. Of course, we can only support a finite number of priority levels here, so be careful with boundary cases . For example, if the current thread already has the highest priority and still requests S_SLEEP , then we just leave it at that priority. Then in schedule , we need to find the thread with the highest priority among all the threads in curcpu->c_runqueue , and bring it to the head . The current CPU's run queue is organized as a doubly linked list with a head element. $OS161_SRC/kern/include/threadlist.h provides several useful interfaces to let us manipulate the list. Finding the maximum/minimum in a list is so simple that I won't provide any details here. But note that the head element is just a placeholder . So you may want to start from curcpu->c_runqueue.tl_head.tln_next and stop when elem->tln_next == NULL . Once we find the thread, we need to bring it to the list head so we can leave thread_switch unchanged. A threadlist_remove followed by a threadlist_addhead will be sufficient here. One problem of MLFQ is starvation . So you may want to periodically reset all threads' priorities to the medium level for fairness. That's all. This is just a working solution; much more can be done if you want better scheduling performance.","tags":"os161","title":"OS161 Process Scheduling"},{"url":"http://jhshi.me/2012/03/14/os161-file-system-calls/index.html","text":"Assuming you've read my previous post on file operations in OS161 , everything here is quite straightforward. One more thing: remember to protect every access to the file descriptor data structure using a lock! Let's get started.
sys_open and sys_close We'll rely on vfs_open to do most of the work. But before that, we need to check: Is filename a valid pointer? (alignment, NULL, kernel pointer, etc.) Is flags valid? flags can only contain exactly one of O_RDONLY , O_WRONLY and O_RDWR After these checks, we need to allocate an fd for the opened file: just scan curthread->t_fdtable and find an available slot ( NULL ). Then we need to actually open the file using vfs_open . Note that we need to copy filename into a kernel buffer using copyinstr , both for security reasons and because vfs_open may destroy the pathname passed in. Once vfs_open successfully returns, we can initialize a struct fdesc . Pay special attention to fdesc->offset . Without O_APPEND , it should be zero. But with O_APPEND , it should be the file size. So we need to check this and use VOP_STAT to get the file size if necessary. sys_close is quite easy. We first decrease the file's reference counter, then close the file using vfs_close and free the struct fdesc if the counter reaches 0. sys_read and sys_write As usual, before doing anything, first check the parameters. The main work here is using VOP_READ or VOP_WRITE together with struct iovec and struct uio . kern/syscall/loadelf.c is a good starting point. However, we need to initialize the uio for reads/writes on user space buffers . That means uio->uio_segflg should be UIO_USERSPACE . Note that uio->uio_resid is how many bytes are left after the I/O operation. So you can calculate how many bytes were actually read/written as len - uio->uio_resid . Since we carefully handled the std files during initialization, here we just treat them as normal files and pay no special attention to them. sys_dup2 The hardest thing here is not how to write sys_dup2 , but rather how dup2 is supposed to be used.
Here is a typical code snippet of how to use dup2 : int logfd = open ( \"logfile\" , O_WRONLY ); /* note the order of the parameters */ dup2 ( logfd , STDOUT_FILENO ); close ( logfd ); /* now all printed content will go to the log file */ printf ( \"Hello, OS161.\\n\" ); We can see that in dup2(oldfd, newfd) : After dup2 , oldfd and newfd point to the same file, but we can call close on either of them without influencing the other. After dup2 , all reads/writes on newfd will actually be performed on oldfd . (Of course, they point to the same file!!) If newfd was previously opened, it should be closed in dup2 ( according to the dup2 man page ). Once we're clear about these, coding sys_dup2 is a piece of cake. Just don't forget to maintain fdesc->ref_count accordingly. sys_lseek , sys_chdir and sys__getcwd Nothing much to say. Use VOP_TRYSEEK , vfs_chdir and vfs_getcwd respectively. Only one thing: if SEEK_END is used, use VOP_STAT to get the file size, as we did in sys_open . 64-bit parameter and return value in lseek This is just a minor trick. Let's first see the definition of lseek : off_t lseek ( int fd , off_t pos , int whence ) And from $OS161_SRC/kern/include/types.h , we can see that off_t is typedef'd as a 64-bit integer ( i64 ). So the question here is: how to pass a 64-bit parameter to sys_lseek , and how to get its 64-bit return value. Pass a 64-bit argument to sys_lseek From the comment in $OS161_SRC/kern/arch/mips/syscall/syscall.c , we can see that fd should be in $a0 , pos should be in ( $a2:$a3 ) ( $a2 stores the high 32 bits and $a3 stores the low 32 bits), and whence should be at sp+16 . Here, $a1 is not used, due to alignment. So in the switch branch for sys_lseek , we should first pack ( $a2:$a3 ) into a 64-bit variable, say sys_pos . Then we use copyin to copy whence from the user stack ( tf->tf_sp+16 ). Get the 64-bit return value of sys_lseek Also from the comment, we know that a 64-bit return value is stored in ( $v0:$v1 ) ( $v0 stores the high 32 bits and $v1 stores the low 32 bits).
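The packing and splitting reduce to plain shift-and-mask operations; here is a standalone C sketch of the convention just described (the function names are mine):

```c
#include <stdint.h>

/* A 64-bit argument arrives in ($a2:$a3) and a 64-bit result leaves in
 * ($v0:$v1), high half in the first register of each pair. */
static int64_t pack64(uint32_t hi, uint32_t lo)
{
    return (int64_t)(((uint64_t)hi << 32) | lo);
}

static void split64(int64_t val, uint32_t *hi, uint32_t *lo)
{
    *hi = (uint32_t)((uint64_t)val >> 32);
    *lo = (uint32_t)((uint64_t)val & 0xffffffffu);
}
```

So in sys_lseek you would build pos from the ( $a2:$a3 ) pair and, on return, send the high half of the result toward $v0 (via retval ) and the low half to $v1 .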
And note that after the switch statement, retval will be assigned to $v0 , so here we just need to copy the low 32 bits of sys_lseek 's return value into $v1 , and the high 32 bits into retval .","tags":"os161","title":"OS161 File System Calls"},{"url":"http://jhshi.me/2012/03/14/os161-file-operation-overview/index.html","text":"In user space, when opening a file, a user program gets a file descriptor (an integer) that represents that file. The user can use this descriptor to perform various operations on the file: read, write, seek, etc. As I see it, this design is quite clean in that it: Hides most of the details from the user, for both safety and simplicity Enables more high-level abstraction: everything (socket, pipe, ...) is a file The file descriptor is actually an index into a kernel-space structure that contains all the details of opened files. So on the kernel side, we need to do a lot of bookkeeping. What information should be kept? It's helpful to take a look at $OS161_SRC/kern/include/vnode.h . In a nutshell, a file is represented by a struct vnode in kernel space. And most of the underlying interfaces that help us manage files have already been provided. All we need to do is bookkeeping. So basically, we need to record the following details about a file: File name. We don't actually need this, but just in case. For example, we may want to print a file's name when debugging. Open flags. We need to keep the flags passed to open so that later on we can check permissions on read or write. File offset. We definitely need this. The file's reference counter. Mainly for the dup2 and fork system calls A lock to protect access to this file descriptor, since it's possible that two threads share the same copy of this bookkeeping data structure (e.g., after fork ) The actual pointer to the file's struct vnode Why don't we record the file's fd? See the next section. File Descriptor Allocation There are some common rules about file descriptors: 0, 1 and 2 are special file descriptors.
They are stdin, stdout and stderr respectively. (Defined in $OS161_SRC/kern/include/kern/unistd.h as STDIN_FILENO , STDOUT_FILENO and STDERR_FILENO ) The file descriptor returned by open should be the smallest fd available. (Not compulsory though) The fd space is process specific , i.e. different processes may get the same file descriptor representing different files. So, to maintain each process's opened file information, we add a new field to struct thread : /* OPEN_MAX is defined in $OS161_SRC/kern/include/limits.h */ struct fdesc * t_fdtable [ OPEN_MAX ]; Now you may figure out why there isn't an fd field in struct fdesc : its index is the fd! So when we need to allocate a file descriptor, we just scan the t_fdtable (from STDERR_FILENO+1 , of course), find an available slot ( NULL ) and use it. Also, since it's a struct thread field, it's process specific. Does the t_fdtable look familiar to you? Yes, it's very similar to our process array, only the latter is system-wide. (Confused? See my previous post on fork ) t_fdtable Management and Special Files Whenever you add a new field to struct thread , don't forget to initialize it in thread_create and clean it up in thread_exit and/or thread_destroy . Since t_fdtable is a fixed-size array, our work is a lot easier: just zero the array on creation, and no cleanup is needed. Also, t_fdtable is supposed to be inheritable: so copy the parent's t_fdtable to the child in sys_fork . Since the parent and child threads are supposed to share the same open files, when copying file tables remember to increase each file's reference counter. Console files (std in/out/err) are supposed to be opened \"automatically\" when a thread is created , i.e. users themselves don't need to open them. At first glance, thread_create would be an intuitive place to initialize them. Yes, we can do that.
But note that when the first thread is created, the console is not even bootstrapped yet , so if you open the console files in thread_create , it'll fail (silently blocking...) at that point. Update : The right way to do this is to initialize the console in runprogram , because that's where the first user thread is born. Later user threads will just inherit the three console file handles from then on. BTW, how to open the console ? The path name should be \"con:\"; flags should be O_RDONLY for stdin, O_WRONLY for stdout and stderr; options should be 0664 (note the zero prefix, it's an octal number).","tags":"os161","title":"OS161 File Operation Overview"},{"url":"http://jhshi.me/2012/03/12/os161-exit-and-waitpid-system-call/index.html","text":"Before going on, I assume you've read my previous post on pid management . Thanks to the struct process , our work is much simplified. Quoting Eric S. Raymond here: Smart data structures and dumb code works a lot better than the other way around. sys_waitpid At first glance, the logic of waitpid is trivial. Yes, it indeed is, in terms of the \"core code\": just acquire the exitlock , see whether the process has exited, otherwise wait for it to exit using cv_wait on exitcv , and get its exitcode. Here I use a cv to coordinate the child and parent processes. Or you can use a semaphore with initial count 0: the child will V the semaphore when it exits, and the parent will P the semaphore in waitpid . But it turns out that most of the code of waitpid is argument checking! More arguments mean more potential risks from user space. Sigh~ Anyway, we are doing kernel programming. Just take a look at $OS161_SRC/user/testbin/badcall/bad_waitpid.c and you'll know what I mean. So basically, we need to check: Is the status pointer properly aligned (by 4)?
Also, after successfully getting the exitcode, don't forget to destroy the child's process structure and free its slot in the procs array. A child has only one parent, so after we've waited for it, no one will care about it any more! sys_exit This part is easy. (Mostly because exit only takes one integer argument!) All we need to do is find our struct process entry using curthread->t_pid , then indicate that \"I've exited\" and fill in the exitcode. The only thing to note is that the exitcode must be made using the macros in $OS161_SRC/kern/include/kern/wait.h . Suppose the user passes in _exitcode ; then we need to set the real exitcode to _MKWAIT_EXIT(_exitcode) . And if we are smarter, we can first check whether the parent exists or has already exited; if so, we don't even bother filling in the exitcode, since no one cares! Anyway, it's just a tiny shortcut, and totally optional.","tags":"os161","title":"OS161 exit and waitpid System Call"},{"url":"http://jhshi.me/2012/03/12/os161-pid-management/index.html","text":"There are many ways to manage each process's pid. Here is the way I do it. I decided to make minimal modifications to $OS161_SRC/kern/thread/thread.c , in case anything gets ruined. So I only add two things to the thread module. One is a t_pid field in struct thread , so that the getpid system call is trivial. The other is a call to pid_alloc in thread_alloc to initialize the new thread's t_pid . That's it. No more touching the thread module. The process Structure In os161, we stick to the 1:1 process:thread model. That is, a process has one and only one thread. Thus a process and a thread are basically the same thing in this scenario. However, I still decided to use a struct process to do the process bookkeeping. It's independent of struct thread and outside the thread module. Thus when a thread exits and its thread structure is destroyed, I still have its metadata (e.g. exitcode) stored in the process structure. So, what should we record about a process?
As we already have struct thread to record most of the information about a thread, we just use a pointer to struct thread to get all that information. What we keep in struct process is mainly for our waitpid and exit system calls. So we should keep: Its parent's pid (if any) Whether the process has exited If the process has exited, the exitcode Synchronization facilities to protect the exit status (lock, cv, semaphore, etc.) Of course, a pointer to struct thread So the structure looks like: struct process { pid_t ppid ; struct semaphore * exitsem ; bool exited ; int exitcode ; struct thread * self ; }; Pid allocation For convenience and simplicity, I decided to support a maximum of MAX_RUNNING_PROCS (256) processes in the OS, regardless of the __PID_MAX (32767) macro in $OS161_SRC/kern/include/kern/limits.h . So I just use a global static array of struct process* to maintain all the processes in the system. Of course it's very dumb, but hopefully it's sufficient for a toy OS like 161. Then allocating a pid is very easy: just scan the process array and find an available slot ( NULL ). One important thing to note: leave pid=0 alone and do not use it, since in /kern/include/kern/wait.h there are two special macros: #define WAIT_ANY (-1) #define WAIT_MYPGRP (0) That is, pid = 0 has a special meaning. So we'd better not use it, starting pid allocation from 1. We can also see this from the __PID_MIN (2) macro in $OS161_SRC/kern/include/kern/limits.h . Once an available slot is found, we need to create a struct process and initialize it appropriately, especially its ppid (-1 or another invalid value).","tags":"os161","title":"OS161 pid Management"},{"url":"http://jhshi.me/2012/03/11/os161-execv-system-call/index.html","text":"Basically, execv does more or less the same thing as runprogram in $OS161_SRC/kern/syscall/runprogram.c .
The overall flow of sys_execv is: Copy the arguments from user space into a kernel buffer Open the executable, create a new address space and load the elf into it Copy the arguments from the kernel buffer into the user stack Return to user mode using enter_new_process Note that I highlighted steps 1 and 3 since they are the trickiest parts of execv ; steps 2 and 4 are just the same as in runprogram . Format of uargs The first argument is progname (e.g., /testbin/argtest ), and the second argument is uargs ; it's an array of pointers, where each pointer points to a user space string. The last pointer in uargs is NULL . Since we don't know how many arguments are in uargs , we need to copy the pointers one by one using copyin until we encounter a NULL . Copy arguments into kernel buffer Whichever way we do this, one of steps 1 and 3 must be complicated. I choose to carefully pack the arguments into a kernel buffer and then directly copy this buffer into the user stack in bulk. Note that in MIPS, pointers must be aligned to 4 bytes, so don't forget to pad when necessary. For convenience, assume that the arguments are { foo , os161 , execv , NULL }. Then after packing, my kernel buffer looks like this: Typo : kargv[2] should be 28, instead of 26. Note that kargv[i] stores the offset of the i'th argument within the kargv array, since up to now we don't know their real user addresses yet. Copy the arguments into user stack Why the user stack, not anywhere else? Because it's the only space we know for sure. We can use as_define_stack to get the value of the initial stack pointer (normally 0x80000000 , aka USER_SPACE_TOP ). So what we do is: Decrease stackptr by the length of the kargv array Fill each kargv[i] with its actual user space pointer Copy the kargv array onto the stack Note that we must fix up kargv[i] before we do the actual copy , otherwise some weird bus error or TLB miss will occur.
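As a host-side check of the packing arithmetic above (a simplified sketch: align4 and pack_args are hypothetical helper names, and a real sys_execv would build the buffer with copyin / copyinstr instead of plain C strings):

```c
#include <stdint.h>
#include <string.h>

/* round up to the 4-byte alignment MIPS requires for pointers */
static size_t align4(size_t n) { return (n + 3) & ~(size_t)3; }

/*
 * Lay out argv exactly like the buffer described in the post:
 * nargs+1 pointer slots first, then the NUL-terminated strings,
 * each padded to a 4-byte boundary.  Each offsets[i] holds the
 * string's OFFSET within the buffer, since the real user addresses
 * are only known once the final stack pointer is computed.
 * Returns the total size of the packed image.
 */
static size_t pack_args(char **argv, int nargs, uint32_t *offsets) {
    size_t pos = (size_t)(nargs + 1) * 4;   /* strings start after the pointer array */
    for (int i = 0; i < nargs; i++) {
        offsets[i] = (uint32_t)pos;
        pos += align4(strlen(argv[i]) + 1); /* string + NUL, padded to 4 */
    }
    offsets[nargs] = 0;                     /* NULL terminator slot */
    return pos;
}
```

With { foo , os161 , execv } the offsets come out as 16, 20 and 28 (matching the corrected kargv[2] = 28), the whole image is 36 bytes, and subtracting that from 0x80000000 gives the final stackptr of 0x7fffffdc.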
The steps are shown as follows (here we assume the initial value of stackptr is 0x80000000 ):","tags":"os161","title":"OS161 execv System Call"},{"url":"http://jhshi.me/2012/03/11/os161-fork-system-call/index.html","text":"If you're not already familiar with the UNIX fork system call, here are its function description and its entry on Wikipedia . Basically, in sys_fork , we need to do the following things: Copy the parent's trap frame, and pass it to the child thread Copy the parent's address space Create the child thread (using thread_fork ) Copy the parent's file table into the child Parent returns with the child's pid immediately Child returns with 0 So, let's get started. Pass parent's trap frame to child thread The trap frame ( struct trapframe ) records the exact state (e.g., registers, stack, etc.) of the parent when it calls fork. Since we need the child to be exactly the same as the parent (except for the return value of fork), we need the child thread to start running with the parent's trap frame. So we need to pass the parent's trapframe pointer to sys_fork , store a full copy of it in the kernel heap (i.e., allocated by kmalloc ), and then pass the pointer to the child's fork entry function (I call it child_forkentry ). Copy parent's address space We can use the as_copy facility to do this. Note that as_copy will allocate a struct addrspace for you and also copy the address space contents, so you don't need to call as_create yourself. Create Child Thread thread_fork will create a new child thread structure and copy various fields of the current thread to it. Again, you don't need to call thread_create yourself; thread_fork will call it for you. You can get the pointer to the child's thread structure from the last argument of thread_fork . Parent's and Child's fork return different values This is the trickiest part. You may want to take a look at the end of syscall to find out the convention for return values.
That is: on success, $a3 stores 0, and $v0 stores the return value (or $v0:$v1 if the return value is 64-bit); on failure, $a3 stores 1, and $v0 stores the error code . The parent part is quite easy: after calling thread_fork , just copy the current thread's file table to the child, do whatever other bookkeeping you need, and finally return with the child's pid, letting syscall deal with the rest. The child part is not that trivial. In order to make the child see fork return 0, we need to play with the trapframe a little bit. Remember that when we call thread_fork in the parent's sys_fork , we need to pass it an entry point together with two arguments ( void* data1, unsigned long data2 ). As said before, I name the entry point child_forkentry ; so what should we pass to it? Obviously, one is the parent's trapframe copy (which lies in a kernel heap buffer) and the other is the parent's address space! Once we've decided what to pass, how to pass it depends on your preference. One way is to pass the trapframe pointer as data1 , and the address space pointer as data2 (with an explicit type-cast, of course). Another way may be to pass the trapframe pointer as data1 , and stash the address space pointer in $a0 , since we know fork takes no arguments. child_forkentry Ok, now child_forkentry becomes the first function executed when the child thread gets to run. First, we need to modify the parent trapframe's $v0 and $a3 to make the child's fork look successful and return 0. Also, don't forget to advance $epc by 4 to prevent the child from calling fork again and again. (BTW, we don't need to do this in the parent, since syscall will take care of it.) Then we need to load the address space into the child's curthread->t_addrspace and activate it using as_activate . Finally, we need to call mips_usermode to return to user mode. But before that, we need to copy the modified trapframe from the kernel heap onto the stack , since mips_usermode checks this ( KASSERT(SAME_STACK(cpustacks[curcpu->c_number]-1, (vaddr_t)tf)) ). How?
Before calling mips_usermode , just declare a struct trapframe (note: not a pointer) and copy the contents into it, then use its address as the parameter when calling mips_usermode . Synchronization Note that thread_fork will set the newly created child thread runnable and try to switch to it immediately. So it's highly possible that before thread_fork returns, the child thread is already running. This is not desired, since we need to copy other stuff, like the file table, to the child thread after thread_fork . We definitely don't want the child thread running without a file table, so we need to prevent the child thread from running until the parent thread has set everything up. One way is to disable interrupts before thread_fork using splhigh , and restore the old interrupt level using splx after the parent thread is done. Update : Disabling interrupts does not necessarily stop the child from running. If you adopt this approach, you need to use some synchronization primitives to coordinate between parent and child. Or better, you can modify thread_fork and copy whatever you need to copy (e.g., the file table) before thread_make_runnable . That way you won't have synchronization issues.","tags":"os161","title":"OS161 fork System Call"},{"url":"http://jhshi.me/2012/02/14/quick-switch-between-source-and-header-files-in-vim/index.html","text":"There are many ways to do this, as listed in the vim wiki . I tried the script way ( a.vim ), but didn't feel comfortable with it, because: I'm doing kernel development, so I have a bunch of my own stdio.h , stdlib.h , etc., but a.vim will bring you to the system include path, not my own; And even after jumping to the right place, jumping back is not easy. Finally, I found the ctags way very usable. Issue this command in your source tree root: $ ctags --extra=+f -R . Then in vim, you can just type :tag header.h to jump to header.h and use the familiar Ctrl+T to jump back. Very intuitive.
Plus, I found vim's gf command, which can jump to the file under the cursor, but it has the same drawbacks as a.vim , so I didn't adopt it. UPDATE Here is a Vim Wiki page about how to jump back and forth using Ctrl-I and Ctrl-O, which is kind of sweet. Thanks @Partha Bera for pointing that out.","tags":"vim","title":"Quick switch between source and header files in Vim"},{"url":"http://jhshi.me/2012/02/14/error-function-declaration-isnt-a-prototype/index.html","text":"This error occurs when you try to declare a function with no arguments and compile with -Werror=strict-prototypes , as follows: int foo (); Fix it by declaring it as int foo ( void ); This is because in C, foo(void) takes no arguments, while foo() takes an unspecified number of arguments. Thanks to this stackoverflow post","tags":"errors","title":"error: function declaration isn't a prototype"},{"url":"http://jhshi.me/2012/02/11/directly-install-sty-files-using-yum/index.html","text":"Use the following command to install sty files, say multirow.sty , using yum: $ sudo yum -y install 'tex(multirow.sty)'","tags":"latex","title":"Directly install sty files using yum"},{"url":"http://jhshi.me/2012/01/31/fix-for-google-calendar-crash-in-chromechromium-on-fedora/index.html","text":"This problem is caused by a collision between the chrome/chromium sandbox and Fedora's SELinux, as explained here . The same problem occurs when you open twitter (see this ). The solution is restorecon -R ~/.config # install restorecond su -c 'yum install policycoreutils-restorecond' # enable it su -c 'chkconfig restorecond on'","tags":"errors","title":"Fix for Google Calendar Crash in Chrome/Chromium on Fedora"},{"url":"http://jhshi.me/2011/11/24/bash-alias-with-argument/index.html","text":"Aliases are a very useful feature of the shell (e.g., bash). For example, I have this line in my .bashrc : alias ll=\"ls -alF | more\" So I can simply use ll to view all the files in the current directory, in my favorite style.
It worked fine until one day I wanted to view the files in a subdirectory instead of the current directory, so I tried: $ ll subdirectory/ But it failed: it still just displayed the contents of the current directory. The reason is that bash interprets the above command as: $ ls -alF | more subdirectory/ But what I had in mind was actually: $ ls -alF subdirectory | more I Googled and found that aliases simply cannot take arguments, but writing a small function works, so I have the code below instead of the ll alias: unalias ll function ll () { ls -alF \"$@\" | more ; } We need to unalias first since, by default, ll is aliased to ls -l --color=auto ; if we don't remove the alias, our function won't be invoked.","tags":"linux","title":"Bash Alias with Argument"},{"url":"http://jhshi.me/2011/11/18/center-the-caption-of-figure-or-table/index.html","text":"In LaTeX, the caption of a figure or table is left-aligned by default. Sometimes that's not very beautiful, especially when your article has two columns. To fix this, use the caption package with center as the option: \\usepackage[center]{caption} If you like, you can also substitute center with left or right . Here is a detailed manual of the caption package.","tags":"latex","title":"Center the Caption of Figure or Table"},{"url":"http://jhshi.me/2011/11/17/figure-over-two-columns-in-latex/index.html","text":"You may often find a table or figure too big to fit into one column when your article has two columns. Use this to insert a figure (the same works for tables) and it will save you: \\begin { figure* } % figure here \\end { figure* } Note the star ( * ) appended to figure ? That's the trick.","tags":"latex","title":"Figure Over Two Columns in Latex"},{"url":"http://jhshi.me/2011/11/13/set-sty-file-path-of-latex-in-fedora-linux/index.html","text":"From time to time, you may want to compose your own sty files to eliminate long headers in your tex files.
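For example, a minimal personal package might look like the following (a sketch only: mystyle.sty and the packages it loads are hypothetical examples, not a recommended set):

```latex
% mystyle.sty -- hypothetical personal package collecting a common preamble
\NeedsTeXFormat{LaTeX2e}
\ProvidesPackage{mystyle}
\RequirePackage{graphicx}
\RequirePackage{amsmath}
\RequirePackage[center]{caption}
\endinput
```

Every document then only needs a single \usepackage{mystyle} line instead of repeating the whole header.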
But it's tedious to put your sty file in the same directory as your tex file every time, since you really want your sty file to be common, i.e., accessible everywhere in your system. One way to achieve this is to put your sty file in the system latex directory (e.g., /usr/share/texlive/texmf-dist/tex/latex ) and then use texhash to refresh the database. But if you would prefer not to touch the system directory, and want to put the sty file somewhere easy to access and back up, that is not a very good practice. Instead, you can just tell latex where to find its sty files, i.e., set the system sty file lookup path. Do this by editing /usr/share/texlive/texmf/web2c/texmf.cnf . Find the variable called TEXINPUTS.tex and add your own sty path there. Don't forget to separate the directories using \";\" and append \"//\" to the last directory. When you finish, execute texhash in a terminal. Then you can feel free to put your sty files in the directory you just specified; latex will now know where to find them. BTW, the comments in texmf.cnf are very useful if you want to do any other tricks.","tags":"latex","title":"Set sty file path of Latex in Linux"},{"url":"http://jhshi.me/2011/11/10/pdflatex-error-while-loading-shared-libraries-libpoppler-so-13/index.html","text":"This error occurred when I tried to use latex after upgrading to fedora 16. After Googling it, I found the reason may be that the upgrade only updates the system components, not user applications such as texlive . And since the current texlive version depends on some older libraries, issues occur. I found that the current libpoppler in /usr/lib is libpoppler.so.18 , so I made a symbolic link to it: sudo ln -s /usr/lib/libpoppler.so.18 /usr/lib/libpoppler.so.13 This fixes the problem. Thanks to this post","tags":"errors","title":"pdflatex: error while loading shared libraries: libpoppler.so.13"}]}