I am a doctoral candidate in the Department of Politics at Princeton University. I am a native of Texas and received a B.A. in Government and History from the University of Texas at Austin.

I am currently on the job market, looking at positions in international relations, comparative politics, and security studies.

My dissertation examines how governments exploit the collective action problems of violent revolutionary or separatist political movements. I have ongoing projects on the domestic politics of war, the determinants of violence against civilians during civil war, and the domestic politics of military buildups.

I post things here from time to time on research methods, BBQ, and the joys and agonies of programming.
Google Books and Microsoft OneNote for coding data

An encouraging norm is emerging in which scholars release, alongside their data, a large PDF of the textual summaries and specific quotes used in each coding decision. I've tinkered with different systems for doing this, including Word, Excel, and Access, but what I have recently discovered works best is, surprisingly, Microsoft OneNote. OneNote offers at least four advantages so far.

First, it makes it very easy to organize raw information by case and then by variable, using pages and subpages. Second, it makes it very easy to get information into OneNote from sources like Google Books: use Zotero to download the book citation automatically, drag and drop it into OneNote, and then use OneNote's screen capture option to quickly copy and paste the relevant page(s) out of Google Books. Third, OneNote will automatically OCR those book images for you, allowing you either to search for words later or to copy and paste the text directly into a Word doc. Fourth, OneNote will export to a Word doc or a PDF, splitting sections based on the case and variable headings you set up in your pages and subpages, which allows you to reorganize things easily before putting out a final product.

Created 08 Aug 2011 21:03 Tags:

Archival Research: Custom Zotero Translators

With more and more archival material being put up on the web, it is important to have a system for downloading and organizing that material for your research. I use Zotero for all of my citation management because 1) it automatically pulls cites and files from the web and 2) it can store them in the cloud so they follow you wherever you go. For specialized electronic archives, however, there may not be a ready-made Zotero translator available. This was the case for the amazing Vietnam Virtual Archive at Texas Tech, so I rolled my own translator using the directions at the links provided below.
I've made the full code available for anyone in need of a quick fix now, and I'll put together something more substantial for the main Zotero trunk when I get time.

http://niche-canada.org/member-projects/zotero-guide/chapter1.html
Created 29 Jun 2011 16:37 Tags: archival-research zotero

GIS: Vectorizing Political Boundaries from Complicated Maps

Anyone who has tried to vectorize a paper map has struggled with the fact that maps are not designed to be read cleanly by computers; they are designed to cram as much information as possible into the smallest space that the human eye can interpret. Using a free image editor called GIMP, I have a quick and dirty way of removing the clutter and leaving only the political boundaries for vectorization.

Take, for example, this political boundary map from the Vietnam War (click to zoom; warning: big download at 12 MB). We just want the country boundaries, both between countries and between land and water. Unfortunately, the map is crammed with political boundaries, rivers, text, etc. You could manually trace the outside perimeter of every country, but that would be very time consuming. You could try just selecting the light blue for the ocean boundary and the black/yellow between the countries, but that will pick up a lot more than just the boundaries, and the lines will be full of holes thanks to all of the text labels that crisscross the boundaries.

A good alternative is to load the picture in a program like GIMP and use the fill tool to give each country and the ocean a distinguishing color. Don't worry about missing small bits; get the large parts filled in and pay particular attention to the border areas. If there are gaps in the borders, use the pencil tool to quickly plug the holes, and continue filling. Now use the color select tool, while holding down Shift, to select all of the areas that are filled in. Create a new layer, and paste that selection into the new layer. What you will have will look like ugly Swiss cheese, from all the holes left by text, map features, and places you missed filling in. Now go to Filters and select the Despeckle filter.
GIMP will make an educated guess about what color should be in the holes based on the nearby colors. Play with the settings until you find a balance that works for you, and run the filter as many times as necessary to plug all the remaining holes. The final product will have a margin of error along the perimeter, but in many cases it will be within tolerance, and it can be improved upon by more careful filling and despeckling. Most importantly, it can be done in a fraction of the time of other methods.

Created 17 Jun 2011 19:03 Tags: gis maps vectorization

Stata: Deleting Variables that are Completely Missing

When merging or subsetting data sets, it is common to end up with variables that are missing for all remaining observations. This code provides a very fast method for removing variables that are completely missing while leaving variables that are nonmissing for at least one observation.
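A minimal sketch of one common way to do this in Stata, offered as an illustration rather than as the exact code this post refers to. It assumes the data set is already in memory; the loop macro name `v` is arbitrary.

```stata
* Sketch: drop every variable that is missing for all observations.
* Assumes the data set of interest is already loaded in memory.
foreach v of varlist _all {
    * assert returns a nonzero _rc if any observation is nonmissing
    capture assert missing(`v')
    if _rc == 0 {
        drop `v'
    }
}
```

Because `missing()` handles both numeric and string variables, a string variable that is empty in every observation is dropped as well.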
Created 17 Jun 2011 18:05 Tags: data-management stata

Archival Research on an Industrial Scale Part 1

Political scientists and historians face at least four major problems in conducting archival research: time, resources, identifying the key information, and making that information available to others for replication purposes. Together these problems either put serious archival work out of the reach of graduate students and junior faculty, or they encourage brief, shallow trips where the exercise becomes "can I find a document that supports my claim?" Over the next several posts I am going to discuss one of the technological solutions I have developed, as well as some online resources that are often overlooked.

One technological fix is to digitize a large volume of documents during a brief research trip, OCR them, and then search for relevant terms back at home. There are many ways to accomplish this, but I have finally tweaked a system that allows for high volume (~10,000 pages a week), low human intervention (automated processing, OCR, filing), and ease of reading and distribution (a single PDF file per set of documents under a common name). The system ensures that I can find the information I am looking for, do it in a relatively short amount of time at the archive, and cite, locate, and share that document years later. The rough outline of my workflow is as follows:
The workflow has a number of nice properties. The tripod and remote control make it so that, when flipping through a folder, the user need only stop long enough to mash the keyboard and wait for the shutter to click. I often fall into a rhythm where I can capture every page in the time it takes to skim the document to decide whether it's worth digitizing, and it can actually be faster to digitize some documents than to evaluate them. For documents where I know I will want every page, I turn on a time-lapse option in my software to take a picture every second, and I just turn the page.

The tall tripod, high-resolution camera, and content-aware software (Scan Tailor) allow the user to capture a document of any size or orientation, as well as the location information inscribed on the folder title. Scan Tailor and OmniPage ensure that the documents are cropped and oriented in the PDF for easy reading. My batch file has a number of optimizations, including a RAM drive and a multithreaded scheduler, so that the computer's full resources are utilized and processing time per image is low (~3 seconds).

The only downside currently is that OmniPage is paid software and high-resolution cameras with remote capabilities are somewhat expensive. Still, when you consider the cost of additional weeks in the field and of not having a replicable research design, the setup more than pays for itself.

Created 16 May 2011 18:02 Tags: archives history software





