frugal technology, simple living and guerrilla large-appliance repair

Regular blog here, 'microblog' there

Many of my traditional blog posts live on this site, but the great majority of my social-style posts can be found on my much-busier microblogging site at updates.passthejoe.net. It's busier because my BlogPoster "microblogging" script generates short, Twitter-style posts from the Linux or Windows (or anywhere you can run Ruby with too many Gems) command line, uploads them to the web server and sends them out on my Twitter and Mastodon feeds.

I used to post to this blog via scripts and Unix/Linux utilities (curl and Unison) that helped me mirror the files locally and on the server. Since this site recently moved hosts, none of that is set up. I'm just using SFTP and SSH to write posts and manage the site.

Disqus comments are not live just yet because I'm not sure about what I'm going to do for the domain on this site. I'll probably restore the old domain at first just to have some continuity, but for now I like using the "free" domain from this site's new host, NearlyFreeSpeech.net.

Wed, 07 Dec 2016

Text processing in Node (i.e. in JavaScript)

My last text-processing project started in Bash, with which I'm more familiar, and then took a turn toward Ruby before returning to Bash when deadlines got tight.

Now I'm thinking about the next election-results script, which won't be using XML from the state of California but instead the space-delimited ASCII from Los Angeles County. Another developer handled that task in November, but I want to take a crack at it for March 2017.

My goal is a "universal" script that can work on any results file that the county provides without requiring a lot of hacking for individual races in any given election.

In other words, I want to write once, run many times.

I could do it in Bash. Or Ruby. But I might want to try JavaScript and run it with Node on the server (or, if the election is "small" enough, client-side in the browser).

LA County data is not standard. It's not XML or JSON (though the county DOES use JSON in its own results, it does not share that data with the media).

Instead, the county uses what appears to be a home-grown data format that is arcane yet well-documented.

Each line begins with an alphanumeric code, and data fields are placed on those lines at predetermined character lengths and predetermined positions.
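To make that concrete, here is a rough sketch of how one fixed-position line might be pulled apart in Node. The record code, field names, offsets and widths below are all made up for illustration; the real ones would have to come from the county's documentation.

```javascript
// Hypothetical layout: the real codes, offsets and widths are defined in the
// county's documentation, not here.
const LAYOUT = [
  { name: 'code', start: 0, length: 3 },
  { name: 'candidate', start: 3, length: 20 },
  { name: 'votes', start: 23, length: 8 },
];

// Cut each field out of the line by position and trim the padding.
function parseLine(line, layout) {
  const record = {};
  for (const { name, start, length } of layout) {
    record[name] = line.slice(start, start + length).trim();
  }
  return record;
}

// An invented sample line: 3-char code, 20-char name, 8-char zero-padded count.
const sample = 'CAN' + 'JANE DOE'.padEnd(20) + '12345'.padStart(8, '0');
console.log(parseLine(sample, LAYOUT));
```

Keeping the positions in a table rather than in the code is the part that appeals to me: a different record type would only need a different table.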

So a script would have to create substrings of the data from each line. I'm thinking that I'll use the script to either create XML that I would then convert, or to skip that step and create JSON directly from the county's data.
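Going straight to JSON might look something like the sketch below. It assumes, purely for illustration, a "CON" line that names a contest followed by "CAN" lines for its candidates; those codes, the field positions and the shape of the output are my placeholders, not the county's format or what the front end actually expects.

```javascript
// Hypothetical record codes and field positions, for illustration only.
const LAYOUTS = {
  CON: [{ name: 'contest', start: 3, length: 30 }],
  CAN: [
    { name: 'candidate', start: 3, length: 20 },
    { name: 'votes', start: 23, length: 8, type: 'number' },
  ],
};

function resultsToJson(text) {
  const contests = [];
  for (const line of text.split(/\r?\n/)) {
    const code = line.slice(0, 3);
    const layout = LAYOUTS[code];
    if (!layout) continue; // skip blank lines and record types not in the table

    // Slice each field by position; convert zero-padded counts to numbers.
    const record = {};
    for (const { name, start, length, type } of layout) {
      const raw = line.slice(start, start + length).trim();
      record[name] = type === 'number' ? Number.parseInt(raw, 10) : raw;
    }

    if (code === 'CON') {
      contests.push({ contest: record.contest, candidates: [] });
    } else if (code === 'CAN' && contests.length > 0) {
      // Attach the candidate to the most recent contest line.
      contests[contests.length - 1].candidates.push(record);
    }
  }
  return JSON.stringify(contests, null, 2);
}

// Invented sample data in the assumed format.
const sampleText = [
  'CON' + 'MAYOR'.padEnd(30),
  'CAN' + 'JANE DOE'.padEnd(20) + '00012345',
  'CAN' + 'JOHN ROE'.padEnd(20) + '00009876',
].join('\n');

console.log(resultsToJson(sampleText));
```

Because the script only knows about record types, not individual races, the same code should in principle run against any file that follows the same layout.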

Doing it in JavaScript would be an opportunity to learn more about the language, just as it would be with Ruby if I went that way. The jury is most definitely still out.

What muddies the water considerably is the fact that my company is also following elections in San Bernardino, Riverside and Orange counties. I know that San Bernardino doesn't really provide data at all. I generally scrape their web page on Election Night. I don't know what Riverside and Orange do.

So I'm going to focus on LA County for now. Another developer wrote the front-end code for the election-results display, and all I have to do is provide the JSON. I wouldn't be opposed to writing the whole app, but for now a "smaller" bite is a more realistic one.