Tag Archives: Twitter

Introducing Twitter.jl

By: randyzwitch - Articles

Re-posted from: http://randyzwitch.com/twitter-api-julia/

This is possibly the latest “announcement” of a package ever, given that Twitter.jl has actually existed on METADATA for nearly a year now, but that’s how things go sometimes. Here’s how to get started with Twitter.jl and some of the highlights.

Hello, World!

If ‘Hello, World!’ is the canonical example of getting started with a programming language, the Twitter API is becoming the first place to start for people wanting to learn about APIs. Authenticating with the Twitter API using Julia is similar to using the R or Python packages, except that rather than doing the OAuth “dance”, Twitter.jl takes all four authentication values in a single function call.

All four of these values can be found after registering at the Twitter Developer page and creating an application. Having all four values in your script is less secure than just providing the API key and API secret, but in the future, I’ll likely implement the full OAuth “handshake”. One thing to keep in mind with this function as it currently works is that no validation of your credentials is performed; the only thing this function does is define a global variable twittercred for later use by the various functions that create the OAuth headers.

Once the credentials are stored, shouting “Hello, World!” to all of your Twitter followers takes a single call to the function that posts a status update.
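
To make the “no validation” point concrete, here is a minimal, self-contained sketch of the store-now, use-later pattern described above. It is not Twitter.jl's source: `Credentials`, `CRED`, `store_credentials` and `describe_status_update` are hypothetical stand-ins, and nothing here touches the network. It is written for current Julia (1.x).

```julia
# Hypothetical stand-ins for illustration only; this is NOT Twitter.jl's source.
struct Credentials
    api_key::String
    api_secret::String
    access_token::String
    access_token_secret::String
end

# Global slot filled in by `store_credentials` (mirrors the idea of `twittercred`).
const CRED = Ref{Union{Nothing,Credentials}}(nothing)

# Stores the four values; performs no validation and makes no network call.
function store_credentials(api_key, api_secret, access_token, access_token_secret)
    CRED[] = Credentials(api_key, api_secret, access_token, access_token_secret)
    return nothing
end

# Later functions read the stored values when they describe or build a request.
function describe_status_update(status::AbstractString)
    cred = CRED[]
    cred === nothing && error("call store_credentials first")
    return (verb = "POST",
            url = "https://api.twitter.com/1.1/statuses/update.json",
            params = Dict("status" => status),
            signed_with = cred.api_key)
end

store_credentials("my-api-key", "my-api-secret", "my-token", "my-token-secret")
req = describe_status_update("Hello, World!")
println(req.verb, " ", req.url)
```

Because the values are only stored, a typo in any of them would not surface until the first real request is signed and sent.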

General Package/Function Structure

From the example above, you can see that the function naming follows the Twitter REST API naming convention, with the HTTP verb first and the endpoint as the remainder of the function name. As such, it’s a good idea at this early stage of the package to have the Twitter documentation open while using it, so that you can quickly find the methods you are looking for.

For each function/API endpoint, I’ve gone through and determined which parameters are required; these are required arguments in the Julia functions. For all other options, each function takes a second optional Dict{String, String} for any option shown in the Twitter documentation. While this Dict structure allows for ultimate flexibility (and quick definition of functions!), I do realize that it’s less than optimal that you don’t know what optional arguments each Twitter endpoint allows.

As an example, suppose you wanted to search for tweets containing the hashtag #julialang. The minimum function call passes only the search string. By default, the API will return the 15 most recent tweets containing the #julialang hashtag. To return the most recent 100 tweets (the maximum per API ‘page’), you can pass the “count” parameter via the options Dict.
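
The “required argument plus optional Dict” pattern can be sketched as follows. This is a self-contained illustration in current Julia (1.x), not the package's code: `search_params` and `query_string` are hypothetical helper names, and percent-encoding of values such as `#` is left out for brevity. The `q` and `count` parameter names come from the Twitter search documentation.

```julia
# Hypothetical sketch of the "required argument + optional Dict" pattern.
# `search_params` is an illustrative name, not a Twitter.jl function.
function search_params(q::String, options::Dict{String,String} = Dict{String,String}())
    params = Dict{String,String}("q" => q)   # the one required parameter
    merge!(params, options)                   # any optional parameters from the docs
    return params
end

# Join the parameters into a query string (keys sorted for a stable order).
# Note: a real request would also percent-encode keys and values.
function query_string(params::Dict{String,String})
    return join(("$k=$(params[k])" for k in sort(collect(keys(params)))), "&")
end

minimal = search_params("#julialang")                        # API default page size
full = search_params("#julialang", Dict("count" => "100"))   # ask for 100 per page
println(query_string(full))
```

The trade-off is the one noted above: the Dict accepts anything, so a misspelled option name is passed along rather than caught by Julia.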

Composite Types and DataFrames definitions

The Twitter API is structured into 4 return data types (Places, Users, Tweets, and Entities), and I’ve mimicked these types using Julia Composite Types. As such, most functions in Twitter.jl return an array of a specific type, such as Array{TWEETS,1} from the prior #julialang search example. The benefit to defining custom types for the returned Twitter data is that rudimentary DataFrame methods have also been defined for them.

I describe these DataFrames as ‘rudimentary’ as they parse the top level of JSON into columns, which results in some DataFrame columns having complex data types such as Dict() (and within the Dict(), nested Dicts!). As a running theme in this post, this is something I hope to get around to improving in the future.
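
Here is a small sketch of why top-level parsing leaves Dict-typed columns. It avoids any package dependency by building a NamedTuple of column vectors instead of a DataFrame; the `Tweet` type, its three fields and `to_columns` are assumptions made for illustration, not the package's TWEETS type. With DataFrames.jl loaded, passing such a NamedTuple of vectors to `DataFrame` should produce the equivalent table.

```julia
# Illustrative type; the fields are a small, assumed subset of a tweet's JSON.
struct Tweet
    id::Int
    text::String
    user::Dict{String,Any}   # nested JSON object kept as-is
end

# One column per top-level field; nested values are not flattened.
function to_columns(tweets::Vector{Tweet})
    colnames = fieldnames(Tweet)
    cols = map(n -> [getfield(t, n) for t in tweets], colnames)
    return NamedTuple{colnames}(cols)
end

tweets = [
    Tweet(1, "Hello, World!",
          Dict{String,Any}("screen_name" => "a", "entities" => Dict{String,Any}())),
    Tweet(2, "#julialang",
          Dict{String,Any}("screen_name" => "b", "entities" => Dict{String,Any}())),
]

cols = to_columns(tweets)
println(keys(cols))
println(eltype(cols.user))   # the `user` column holds whole Dicts
```

Flattening would mean promoting keys such as `screen_name` to columns of their own, which is the cleanup described above.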

Want to Get Started Developing Julia? Start Here!

One of the common questions I get asked is how to get started with Julia, both from a learning perspective and from a package development perspective. Hacking away on the core Julia codebase is great if you have the ability, but the code can certainly be intimidating (the people are quite friendly though). Creating a package isn’t necessarily hard, but you have to think about an idea you want to implement. The third alternative is…

…improve the Twitter package! If you go to the GitHub page for Twitter.jl, you’ll see a long list of TODO items that need to be worked on. The hardest part (building the OAuth headers) has already been taken care of. What’s left is refactoring the code for simplification, factoring out the OAuth code in general into a new Julia library (also partially started), then building the Streaming API functions, cleaning up the DataFrame methods to remove the Dict column types, paging through API results…and so on.

So if any of you are on the sidelines wanting to get some practice on developing packages, without needing to worry about learning Astrophysics first, I’d love to collaborate. And if any Julia programming masters want to collaborate, well that’s great too. All help and pull requests are welcomed.

In the meantime, hopefully some of you will find this package useful for natural language processing, social networking analysis or even creating bots 😉

Code Refactoring Using Metaprogramming

By: randyzwitch - Articles

Re-posted from: http://randyzwitch.com/julia-metaprogramming-refactoring/

It’s been nearly a year since I wrote Twitter.jl, back when I seemingly had MUCH more free time. In these past 10 months, I’ve used Julia quite a bit to develop other packages, and I try to use it at work when I know I’m not going to be collaborating with others (since my colleagues don’t know Julia, not because it’s bad for collaboration!).

One of the things that’s obvious from my earlier Julia code is that I didn’t understand how powerful metaprogramming can be, so here’s a simple example where I can replace 50 lines of Julia code with 10.

CTRL-A, CTRL-C, CTRL-V. Repeat.

Admittedly, when I started on the Twitter package, I fully meant to go back and clean up the codebase, but moved on to something more fun instead. The Twitter package started out as a means of learning how to use the Requests.jl library to make API calls. I figured out the OAuth syntax I needed (which itself should be factored out of Twitter.jl), then copied and pasted the same basic function structure over and over. While fast, what I was left with (currently, the help.jl file in the Twitter package) was five functions that follow the same exact code pattern, right down to the spacing!

The way to interpret this code is that for these five Twitter API methods, there are no required inputs. Optionally, there is the ‘options’ keyword that allows for specifying a Dict() of options. For these five functions, there are no options you can pass to the Twitter API, so even this keyword is redundant. These are simple functions, so I don’t gain a lot by way of maintainability by using metaprogramming, but at the same time, one of the core tenets of programming is ‘Don’t Repeat Yourself’, so let’s clean this up.
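
For a sense of the repetition, here is a sketch of two functions written in that copy-and-paste style. It is self-contained, so `fake_get` is a hypothetical stand-in for the package's OAuth request helper that simply echoes the URL. The function names are assumed from the verb-plus-endpoint naming convention; the two endpoints are Twitter API v1.1 help endpoints.

```julia
# Stand-in for the package's OAuth GET helper (assumed name and behavior):
# it echoes the URL instead of making a signed request.
fake_get(url::String, options::Dict) = (status = 200, data = url)

function get_help_configuration(; options = Dict{String,String}())
    r = fake_get("https://api.twitter.com/1.1/help/configuration.json", options)
    return r.status == 200 ? r.data : r
end

function get_help_languages(; options = Dict{String,String}())
    r = fake_get("https://api.twitter.com/1.1/help/languages.json", options)
    return r.status == 200 ? r.data : r
end

# ...and so on for three more endpoints, identical apart from the name and URL.
println(get_help_languages())
```

Only two things change between the definitions, the function name and the endpoint path, and that is exactly what makes the pattern a candidate for code generation.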

For :symbol in symbolslist…

In order to clean this up, we need to take out the unique parts of each function, then interpolate them into a single function definition that is evaluated with the @eval macro.

What I do is define two tuples: one of function names (as symbols, denoted by ‘:’) and one of the API endpoints. We can then iterate over the two tuples, substituting the function names and endpoints into the code. When the package is loaded, this code evaluates, defining the five functions for use in the Twitter package.
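
The loop described above can be sketched as follows. Treat it as an illustration under stated assumptions rather than the package's actual file: `fake_get` is a hypothetical stand-in for the OAuth request helper, and the five function names and endpoint paths are assumed from the naming convention and the Twitter API v1.1 documentation.

```julia
# Stand-in for the package's OAuth GET helper (assumed); it echoes the URL.
fake_get(url::String, options::Dict) = (status = 200, data = url)

# The only parts that differ between the five functions.
funcnames = (:get_help_configuration, :get_help_languages, :get_help_privacy,
             :get_help_tos, :get_application_rate_limit_status)
endpoints = ("help/configuration.json", "help/languages.json", "help/privacy.json",
             "help/tos.json", "application/rate_limit_status.json")

for (func, endp) in zip(funcnames, endpoints)
    url = "https://api.twitter.com/1.1/" * endp
    # `$func` and `$url` are spliced into the definition before it is evaluated,
    # so each pass through the loop defines one ordinary top-level function.
    @eval function $func(; options = Dict{String,String}())
        r = fake_get($url, options)
        return r.status == 200 ? r.data : r
    end
end

println(get_help_tos())
```

Adding a sixth endpoint of the same shape then means adding one symbol and one string to the tuples.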

Wha?

Yeah, so metaprogramming can be simple, but it can also be mind-bending. It’s one thing to not repeat yourself; it’s another to write something so complex that even YOU can’t remember how the code works. But somewhere in between lies a sweet spot where you can refactor whole swaths of code and streamline your codebase.
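
One way to keep generated code from becoming a mystery is to build the definition as an expression first, print it, and only then evaluate it. The sketch below does that for a single function; the name `get_help_tos` and the URL are illustrative assumptions, and `Base.remove_linenums!` is an unexported Base helper used here only to tidy the printed output.

```julia
# Build the definition as data (an Expr) instead of evaluating it right away.
func = :get_help_tos
url = "https://api.twitter.com/1.1/help/tos.json"

ex = quote
    function $func()
        return $url
    end
end

# Strip line-number nodes from a copy so the printed code is easier to read.
readable = Base.remove_linenums!(deepcopy(ex))
println(readable)

# Once the expression looks right, evaluate it to define the function.
eval(ex)
println(get_help_tos())
```

Printing the expression shows the code that will be defined, with the name and URL already substituted.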

Metaprogramming is used throughout the Julia codebase, so if you’re interested in seeing more examples of metaprogramming, check out the Julia source code, the Requests.jl package (where I first saw this) or really anyone who actually knows what they are doing. I’m just a metaprogramming pretender at this point :)

 

To read additional discussion around this specific example, see the Julia-Users discussion at:

https://groups.google.com/forum/#!topic/julia-users/zvJmqB2N0GQ