Posts Tagged ‘Data Visualization’

Techy Art: Algorithmic Imagery

I’m working on an interesting project– I’m writing programs that convert text to images based on the key strokes. It’s turning up some unusual results! My plan is to create a program that turns 140 characters into an image, so that people can tweet art! Ultimately, I would like to be able to upload a text file and have it turned into a png file, or even better, upload a png file and turn it into text! It’s not really an encryption scheme, since it can be easily deciphered, but I think it might be interesting digital artwork. For instance, Mike and I could upload the source code from our websites and create images from them– unique embodiments of our technical work!


I recently read an article on steganography, the art and science of hiding messages inside images, and it got me thinking: “what if the message wasn’t hidden in the image, but instead the message defined the image?” Now, I’m not the artistic type; the kind of art I am interested in is methodical (read: paint by numbers). But I am a computer scientist, and I know that a color, as the computer understands it, is a 4-byte unsigned integer that defines an ARGB value (Alpha, Red, Green, Blue), where each byte gives the quantity of one component. I also know that a character like the lowercase letter a is stored as a single byte when encoded in UTF-8, and that the characters on a standard keyboard have values between 32 and 126. Therefore, four characters, like the word duck, can be translated into a unique ARGB value (or, for opaque colors, three characters, like the word ben, can be translated into an RGB value).
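A minimal sketch of this idea in Python (not the original program; the function names `pack_argb` and `to_rgb` are invented for illustration, and the byte order with alpha first is an assumption):

```python
def pack_argb(chars: str) -> int:
    """Pack four single-byte characters into one 32-bit ARGB integer."""
    data = chars.encode("utf-8")
    if len(data) != 4:
        raise ValueError("expected exactly four single-byte characters")
    # Big-endian: the first character becomes the most significant byte (alpha).
    return int.from_bytes(data, "big")


def to_rgb(chars: str) -> tuple:
    """Read three single-byte characters as an (R, G, B) triple."""
    data = chars.encode("utf-8")
    if len(data) != 3:
        raise ValueError("expected exactly three single-byte characters")
    return (data[0], data[1], data[2])


print(hex(pack_argb("duck")))  # 0x6475636b
print(to_rgb("ben"))           # (98, 101, 110)
```

All three values for ben sit near 100, which already hints that plain text will produce fairly dark, greyish colors.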

Thus began my experiment. I wanted to see what a translation from text characters to colored pixels would produce, and whether there could be meaning in art created algorithmically. The quote above is from an email I sent to Jaci describing what I was working on. The quote has one convenient feature: it is 675 characters long. If you divide 675 by 3 (the number of characters you need to get an RGB value), you get 225, which is 15 squared. This is perfect for an image of 150 x 150 pixels, where each RGB value fills a 10 x 10 pixel square (100 pixels), giving a 15 x 15 grid of squares. Here was my first result:

Photext- First Attempt

First Attempt

So when you take the above quote and translate the byte sequence into RGB values, you get this colored, kind of grey, very dark image. The reason is that the characters on American keyboards (the ones we regularly type) have values from 32 to 126, while each color component is defined with a value between 0 and 255, where 0 is none of that color and 255 is all of it. When you put an RGB value together, the amount of each component determines the color, so (255, 0, 0) is red, (0, 0, 0) is black (no color), and (255, 255, 255) is white (all colors). If the three values are equal, say (64, 64, 64), you get grey; greys are darker as the numbers get closer to 0 and lighter as they get closer to 255. So because the characters we normally use have values that are close together and fairly low, you get a grey, dark image. This is especially true since the most common character we use is the space, “ ”, which is 32.
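The direct translation can be sketched in Python as follows. This is only an illustration under assumptions: the function name `text_to_blocks` is made up, blocks are assumed to be laid out left to right and top to bottom, the sample text is a stand-in for the 675-character quote, and the image is kept as plain nested lists rather than written to a file.

```python
def text_to_blocks(text, grid=15, block=10):
    """Return a (grid*block) x (grid*block) image as rows of (R, G, B) tuples."""
    data = text.encode("ascii")
    if len(data) != grid * grid * 3:
        raise ValueError("text length must be grid * grid * 3 characters")
    # One color per group of three characters.
    colors = [tuple(data[i:i + 3]) for i in range(0, len(data), 3)]
    size = grid * block
    # Pixel (x, y) takes the color of the block it falls in (row-major order).
    return [
        [colors[(y // block) * grid + (x // block)] for x in range(size)]
        for y in range(size)
    ]


# Stand-in text of exactly 675 characters (20 * 33 + 15).
sample = "the quick brown fox " * 33 + "the quick brown"
image = text_to_blocks(sample)

print(len(image), len(image[0]))  # 150 150
print(image[0][0])                # (116, 104, 101), the bytes of "the"
# No channel can exceed 126 for printable ASCII, hence the dark result.
print(max(max(px) for row in image for px in row) <= 126)  # True
```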

To fix this, I shifted the values for the characters by distributing them evenly across the range 0-255 (for the nerds out there, I used the formula: Math.ceil(((charCode - 32) / 94) * 255)). This had the effect of spreading the values of the characters farther apart, making them less grey. It also lightens the image, because the top values are now closer to white instead of black, and I got this:

Photext - Shift

Shifting the colors to a full range
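The rescaling formula can be sketched in Python like this, with the parentheses grouped so that the multiplication by 255 happens before rounding up (rounding first would leave only the values 0 and 255). The function name `rescale` is an assumption for illustration:

```python
import math


def rescale(char_code: int) -> int:
    """Stretch a printable-ASCII code (32..126) onto the 0..255 range."""
    return math.ceil((char_code - 32) / 94 * 255)


print(rescale(ord(" ")))  # 0   (space, the lowest printable character)
print(rescale(ord("~")))  # 255 (tilde, the highest printable character)
print(rescale(ord("a")))  # 177
```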

Much, much better! And actually an interesting result, because all the colors are there: reds, greens, blues, yellows. That is what I was hoping for! I was afraid that a direct byte-by-byte translation would produce uninteresting results (monochrome, shades of the same color, etc.). There were still a lot of greys and whites, and I thought this could be the fault of the most common character, the space, which my shift gave a value of 0. So I made the space 255 (all white). This was a failure, because it lightened the image up too much:

Photext- White Shift

Making space, " ", white

I had been hoping that maximizing a color value might make the color more vibrant. But as it turns out, a space is only one of the three values in a pixel, and the other two are probably also high. So a 0 value actually makes the other two colors more vibrant, while a 255 value just makes things whiter. Blocks might not be the most interesting result, so I tried a random scatter of each color’s 100 pixels:

Photext - Scatter

Scattering the pixels randomly

But this has a problem: when you inject randomness into the formula, you can’t translate the image back to text. So my original idea of an algorithmically generated piece of art, whose meaning could be read in plain English, would be lost. This image could be translated into any number of texts, not just the paragraph that you read in the beginning. Blocks are boring, so next I tried 100px lines, hoping that this would give me the fluidity I was looking for:

Photext- Lines

Lines instead of blocks

Alas, it did not. It gave me three columns because 100 is 2/3 of 150, so the image was broken into thirds. I am currently experimenting with some other manipulations, including spirals, alpha gradient overlays, gradients, frequency histograms, etc. If you have any ideas, please comment them so I can try them out!
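The three columns in the lines experiment can be checked with a little arithmetic. The sketch below assumes the 100-pixel runs are written one after another, left to right, wrapping at the right edge of the 150-pixel-wide image; that layout is a guess about how the lines were drawn, and the names are made up for illustration:

```python
WIDTH, HEIGHT, LINE = 150, 150, 100


def line_start_columns(n_lines: int) -> list:
    """Column at which each 100-pixel run begins when pixels fill rows in order."""
    return [(i * LINE) % WIDTH for i in range(n_lines)]


starts = line_start_columns(225)   # 225 colors, one run each
print(225 * LINE == WIDTH * HEIGHT)  # True: the runs fill the image exactly
print(sorted(set(starts)))           # [0, 50, 100]
```

Under that assumption every color change happens at column 0, 50, or 100, which would produce three vertical bands, each 50 pixels wide.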

So what is the end result? As you read from my paragraph, I’m thinking about making this mainstream by creating a web application for tweets. Of course, now that I’m posting this as a blog, anyone can steal that idea and run with it. Maybe I should grab the domain tweettoart.com! But I’m thinking this could have bigger implications. I also mentioned that I could uniquely turn my web pages into graphics, or even bigger- I could take the entire New York Times website on a particular day and turn it into some sort of graphic! If you have ideas for applications of this, please comment below and let me know!

Also, I’m hoping that using these methods, we could find interesting images, where text can be written specifically to produce an image. Then both the image and the text would have meaning! Imagine creating poetry that designed an image (hint, hint, Devi and Bethany!) That would be really cool.

If you want a copy of the program in its current format to install on your computer (It’s Adobe AIR), so that you can play around… email me or comment below. I’d be happy to share, just keep in mind that it is experimental, so you might be able to break it.

31 03 2010

Keywords Visualization Part 2

Keywords Visualization v2


See the live visualization at: http://www.bengfort.com/keywords/

Version 2 of the Bengfort.com Keywords Visualization is now complete. This update includes tool tips (hover the mouse over a node and the tool tip appears) that describe the data in plain language: in this case, how many times the keyword appears on Bengfort.com (note that multiple appearances in a single post are all counted, as opposed to counting the number of posts that include the keyword; I hope my grammar was clear enough to explicate this). In addition, the strength of the links is now shown visually via color and thickness. This actually expanded the graph, so I may have to refactor so it all fits on the screen. Finally, if you click on a node, it searches Bengfort.com for the last 7 posts that include the keyword and presents those posts to you!

Additionally, the graph has now been made clearer. Link strength is identified on a 5-point scale; however, I have excluded strengths of 1 or 2. This immensely cleans up the graph and shows only the most relevant links. While this means some links no longer appear, other connections have suddenly become more apparent. (I’m especially fond of the link between our disciplines and technology!)
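The filtering step can be pictured with a small Python sketch. The link records and their strength values below are made up for illustration, and how raw link counts are mapped onto the 5-point scale is not specified here, so the sketch starts from strengths that are already on that scale:

```python
# Hypothetical link records; the strengths are invented sample values.
links = [
    {"source": "cat", "target": "dog", "strength": 5},
    {"source": "Guyana", "target": "Recipes", "strength": 4},
    {"source": "Benjamin", "target": "Ballet", "strength": 3},
    {"source": "China", "target": "Cookbook", "strength": 2},
    {"source": "science", "target": "dog", "strength": 1},
]

MIN_STRENGTH = 3  # strengths of 1 or 2 are excluded


def visible_links(all_links, minimum=MIN_STRENGTH):
    """Keep only the links strong enough to be drawn."""
    return [link for link in all_links if link["strength"] >= minimum]


for link in visible_links(links):
    print(link["source"], "-", link["target"], link["strength"])
```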

This version finalizes the tutorial on SitePoint.com with modifications made to make it work for Bengfort.com. I know this graph is still far from perfect, and I intend to continue to explore making this application more usable. To that end, any feedback from you guys would be much appreciated! For instance, tell me what you see, and if there is anything I could do to make it better. (Remember, I don’t know what you see, I only see what is on my browser). If there is any functionality you think should be included, if there are any keywords you want added, please let me know! I know that it is easy to read the post and to forget about it, but the more you think about it, the better I can make it!

Once again, the link to the visualization is: http://www.bengfort.com/keywords/ (check it out!)

17 12 2009

Data Visualization of Bengfort.com Keywords

Data visualization of frequency of Bengfort.com keywords


See the live visualization and play with the Spring-Graph structure at: http://www.bengfort.com/keywords/

I’m constantly amazed at how people can manipulate data and statistics any way they see fit to make their own point. I don’t know about you, but whenever someone gives me the “numbers” I’m very skeptical of where they came from. Just consider the fodder from our so-called major news networks that Jon Stewart has to make fun of! I think I’m right in saying that people are all too willing to believe “numbers” just because they look science-y or there is a pretty bar graph. Even simple inspection will reveal flaws: percentages that don’t add up to 100 or whose sum far exceeds 100, graphs that use highlighting and weighted fonts that don’t necessarily reflect the distribution, or the simple omission of keys (legends) that would prove an opposite point.

That’s why the science (and art) of data visualization so appeals to me. We have been taught bar graphs, line graphs, and pie charts since elementary school, but these are the tools that are so often used to mislead us, simply because they are too simple to hold the complex data that we are now used to analyzing on a daily basis. Data visualization attempts to take complex data sets and represent them graphically in a way that lets humans instantly comprehend their meaning. Visual cues including size, color, shape, and difference are all used to represent some form of data. With the growth of web technologies and web databases, an ever-increasing number of amazing and interesting data visualizations have appeared, and I believe that soon elementary school kids will be taught even more complex data structures.

So, when I got a tweet from Sitepoint.com about building a keyword visualizer with Flex, I knew that this would be perfect for our website. So I read the article and built a version of what they used for our website! (Note that at this point I’m still awaiting the third part of this three-part series, and then I will continue to make my own customizations, so stay tuned for more updates to the visualization!) Essentially, what is happening is that a script goes through our blog database and picks out keywords in all the posts. Keywords that appear in the same post are considered linked. For instance, by writing science and technology together in this post, those two words will now have a link between them. In addition, the script counts the occurrences of each keyword as well as the occurrences of the links. (If you’re keeping score, this is a server-side PHP script that outputs the results in JSON format.)
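The counting logic can be sketched in Python rather than PHP. This is a rough illustration, not the actual script: the function name `keyword_graph`, the sample posts, the choice to count a link once per post, and the shape of the JSON output are all assumptions.

```python
import json
import re
from collections import Counter
from itertools import combinations


def keyword_graph(posts, keywords):
    """Count keyword occurrences and same-post co-occurrences (links)."""
    counts = Counter()
    links = Counter()
    for text in posts:
        words = re.findall(r"[a-z]+", text.lower())
        found = Counter(w for w in words if w in keywords)
        counts.update(found)  # every appearance counts, not just one per post
        # Any two keywords appearing in the same post share a link.
        for pair in combinations(sorted(found), 2):
            links[pair] += 1
    return {
        "nodes": [{"keyword": k, "count": c} for k, c in sorted(counts.items())],
        "links": [
            {"source": a, "target": b, "count": c}
            for (a, b), c in sorted(links.items())
        ],
    }


# Made-up sample posts for demonstration.
posts = [
    "Science and technology go together; science is everywhere.",
    "A post about technology and cooking.",
]
graph = keyword_graph(posts, {"science", "technology", "cooking"})
print(json.dumps(graph, indent=2))
```

With these two sample posts, science and technology are each counted twice, cooking once, and two links appear: science with technology, and cooking with technology.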

The visualization is handled by Adobe’s Flex framework combined with the SpringGraph API. The more a keyword appears, the larger its node will be; in addition, the higher the count of links between keywords, the larger the link will be. Distance is also a factor: the larger keywords are on the outside, with the lesser keywords on the inside, and they “repel” each other according to the strength of their links. Now, by simple inspection we can see that Guyana, linked with Recipes and Cookbook, is by far the largest part of our website. Benjamin is connected to China and Ballet (don’t know why), while cat and dog are so closely connected that they are almost touching! You can see how this provides basically a topical analysis of our blog!

I know you guys may not find this particularly interesting, but I hope you can grasp how much data has been distilled into an easily viewable graph: we have over 600 posts in our blog, each with about 700 words, all distilled into an easily comprehensible visual medium. As our blog changes, so will the graph. I think that in all our fields (International Relations, Political Science, Anthropology, Business, International Education, and Computer Science) this is extremely relevant, and I hope that you guys will make use of the tools that I have shown you. (Speaking of those fields, I should probably add them as keywords!) If you are interested in doing any sort of complex visualization, trust me, I’m your guy to develop an application for you that will do it!

See the live visualization and play with the Spring-Graph structure at: http://www.bengfort.com/keywords/

10 12 2009