The “Goodness” of Blobbing 11/18/2017

What We Need

Today, we work on implementing Ferrer’s code in terms of our own program. This includes understanding its input and output, and using or adjusting them to our needs.

Ultimately, what we need is for a preliminary score of “goodness” based on Ferrer’s code to be automatically exported in the same text file containing the rules for recreation of an image. This should be done in our own program, and shouldn’t require much extra work for the user.

We will start by understanding Ferrer’s code and what it does.

Code Review

Firstly, we look at what Ferrer’s code does.

Essentially, it uses k-means clustering to reduce the image to a set number of colors, and that number is chosen by the user. Then it goes through the image and finds the different “blobs”, which are connected regions of the same simplified color.

Blob Detections
This image was sorted into two colors. The blob currently selected is shown in red; the program found 317 blobs in total.
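The blob-finding step described above can be sketched as a flood fill over the color-reduced image. This is a minimal sketch, not Ferrer’s actual code: the class name `BlobFinder`, the `int[][]` grid of color indices standing in for the k-means output, the use of `java.awt.Point`, and the choice of 4-connectivity are all assumptions made for illustration.

```java
import java.awt.Point;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch, not Ferrer's code. Each cell of `labels` holds the
// index of the simplified color that k-means assigned to that pixel.
final class BlobFinder {

    // Returns every 4-connected region of equal color index as a set of points.
    static List<Set<Point>> findBlobs(int[][] labels) {
        int height = labels.length;
        int width = height == 0 ? 0 : labels[0].length;
        boolean[][] visited = new boolean[height][width];
        List<Set<Point>> blobs = new ArrayList<>();
        int[][] steps = {{1, 0}, {-1, 0}, {0, 1}, {0, -1}};

        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                if (visited[y][x]) {
                    continue; // pixel already belongs to an earlier blob
                }
                Set<Point> blob = new LinkedHashSet<>();
                Deque<Point> frontier = new ArrayDeque<>();
                frontier.push(new Point(x, y));
                visited[y][x] = true;
                while (!frontier.isEmpty()) {
                    Point p = frontier.pop();
                    blob.add(p);
                    for (int[] s : steps) {
                        int nx = p.x + s[0];
                        int ny = p.y + s[1];
                        if (nx >= 0 && nx < width && ny >= 0 && ny < height
                                && !visited[ny][nx]
                                && labels[ny][nx] == labels[y][x]) {
                            visited[ny][nx] = true;
                            frontier.push(new Point(nx, ny));
                        }
                    }
                }
                blobs.add(blob);
            }
        }
        return blobs;
    }

    public static void main(String[] args) {
        int[][] labels = {
            {0, 0, 1},
            {0, 1, 1},
            {1, 0, 0},
        };
        System.out.println(findBlobs(labels).size() + " blobs found");
    }
}
```

Note that a search like this has no minimum size, so a single isolated pixel comes back as a blob of its own.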

A couple of things stood out: the user is required to input the number of colors the image is sorted into, allowing more colors produces more blobs, and the program counts even one-pixel areas as blobs.

This needs to change. First, the number of colors for k-means needs to be set inside the code rather than chosen by the user. Second, we need the program to sort the blobs into small, medium, and large for us.


The really scary part is messing around in Ferrer’s code. It is now attached to our code, so I can begin to butcher it according to what we need.

Butchering it
This is a start to editing Ferrer’s program.

Next, we have to read the code and find out whether blobs hold the data needed to determine how large they are, whether we can access it, and whether we can edit it for our purposes. Luckily, Ferrer’s code is incredibly well organized.

With some minor adjustments, I was able to make functions return the values I wanted, and write some code in the controller to do two things:

  • Have the program tell us the number of large, medium, and small blobs
  • Have the program fill in the large, medium, and small blobs with their own colors instead of red

Visually, it was necessary to do both of these things, since I next need to answer two questions:

  • What defines blobs as large, medium and small?
  • What are “good” ranges for a picture having these?

For this, I will go back to the Matisse “Seated Ruffian” and estimate by eye how many large, medium, and small shapes I think it has. I will use this to decide how to judge the size of blobs and, from there, the “goodness” of having that many blobs.

Matisse back at it again

From here I can mess around with the numbers in Java to make the large, medium, and small shapes fit this new standard.

Currently, large is set to be 25,000 pixels or greater, medium is 1,000 pixels or greater (but below 25,000), small is 100 pixels or greater (but below 1,000), and tiny is less than 100 pixels.
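The thresholds above can be sketched as a small classifier. The pixel cutoffs come from this post; the class, enum, and method names are hypothetical, and the sketch assumes the buckets do not overlap (a tiny blob is not also counted as small), which matches the separate small and tiny counts reported for the Matisse.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical sketch of the size buckets; names are assumptions,
// only the pixel thresholds are taken from the post.
final class BlobSizes {
    enum SizeClass { TINY, SMALL, MEDIUM, LARGE }

    static final int LARGE_MIN = 25_000;
    static final int MEDIUM_MIN = 1_000;
    static final int SMALL_MIN = 100;

    // Maps a blob's pixel count to exactly one size class.
    static SizeClass classify(int pixelCount) {
        if (pixelCount >= LARGE_MIN) return SizeClass.LARGE;
        if (pixelCount >= MEDIUM_MIN) return SizeClass.MEDIUM;
        if (pixelCount >= SMALL_MIN) return SizeClass.SMALL;
        return SizeClass.TINY;
    }

    // Counts how many blobs fall into each size class.
    static Map<SizeClass, Integer> countBySize(int[] pixelCounts) {
        Map<SizeClass, Integer> counts = new EnumMap<>(SizeClass.class);
        for (SizeClass c : SizeClass.values()) {
            counts.put(c, 0);
        }
        for (int n : pixelCounts) {
            counts.merge(classify(n), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] pixelCounts = {1, 42, 150, 999, 1_000, 30_000};
        System.out.println(countBySize(pixelCounts));
    }
}
```

Keeping the cutoffs as named constants makes them easy to adjust while fitting them to a reference image.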

Our Matisse has:

  • 2 large blobs
  • 11 medium blobs
  • 21 small blobs
  • 495 tiny blobs

We could say that an image is “good” if it has 2 large shapes, 11 medium shapes, and 21 small shapes, while keeping fewer than 500 tiny shapes. The further an image strays from these values, the worse it should score.

From here we will come up with a “goodness” value for other images, based on the scale of the Matisse.

output percentages
Our example output of “fitness” or percentages based on these shapes. On Monday, I will have a more detailed discussion on how to better calculate these percentages.
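One possible way to turn the Matisse counts into a percentage is sketched below. This is only a placeholder, not the formula the program currently uses: the ratio-based closeness measure, the equal weighting of the four categories, and all names are assumptions. Only the target counts (2, 11, 21) and the limit of 500 tiny shapes come from this post.

```java
// Hypothetical scoring sketch; the formula is an assumption, not the
// program's actual calculation. Targets are the Matisse counts above.
final class Goodness {
    static final int TARGET_LARGE = 2;
    static final int TARGET_MEDIUM = 11;
    static final int TARGET_SMALL = 21;
    static final int TINY_LIMIT = 500;

    // 1.0 when the count matches the target, falling toward 0.0 as it strays.
    static double closeness(int actual, int target) {
        if (actual == target) {
            return 1.0;
        }
        return (double) Math.min(actual, target) / Math.max(actual, target);
    }

    // Full marks below the limit, then a penalty that grows with the count.
    static double tinyScore(int tiny) {
        return tiny < TINY_LIMIT ? 1.0 : (double) (TINY_LIMIT - 1) / tiny;
    }

    // Averages the four category scores and expresses the result as a percentage.
    static double score(int large, int medium, int small, int tiny) {
        double sum = closeness(large, TARGET_LARGE)
                + closeness(medium, TARGET_MEDIUM)
                + closeness(small, TARGET_SMALL)
                + tinyScore(tiny);
        return 100.0 * sum / 4.0;
    }

    public static void main(String[] args) {
        System.out.println("Matisse: " + score(2, 11, 21, 495) + "%");
        System.out.println("One large blob: " + score(1, 11, 21, 495) + "%");
    }
}
```

With this measure the Matisse itself scores 100%, and an image that differs only by having one large blob instead of two scores 87.5%.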

For now, we will move on to identifying not only shapes but also lines.


bob the line blob

Lines aren’t just vertical and horizontal. They can also be diagonal and, most importantly, they can curve. To find them, I will have to…

  • identify individual blobs in our blob list that qualify as a line by how thin they are
  • find any diagonal neighboring “lines” and string them together to make one big “line”
  • keep track of the “lines” in a list so we can easily count them and go through them.

Blobs don’t have a width or height, though. Rather, each one consists of a LinkedHashSet of points. This very literal representation won’t do for finding lines, however.

To find the approximate height and width of a blob, I will find two important points for it. The first has the lowest x and lowest y values found among the blob’s points; the second has the highest x and highest y values. Together they are the opposite corners of the smallest rectangle that contains the blob.

These are roughly illustrated below, with some code I have written to find them.

Wanted but not having
We will use these two points to define the height and width of a blob, which in turn lets us find horizontal and vertical lines.
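Since the code itself appears here only as an image, the following is a minimal sketch of how those two corner points and the resulting width and height could be computed from a blob’s point set. It is not the code from the screenshot: the class and method names are hypothetical, and it assumes the points are `java.awt.Point` objects.

```java
import java.awt.Point;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch, not the code from the post. Assumes a blob is a
// non-empty set of java.awt.Point pixel coordinates.
final class BlobBounds {

    // Returns {minX, minY, maxX, maxY} for the blob's bounding rectangle.
    static int[] bounds(Set<Point> blob) {
        if (blob.isEmpty()) {
            throw new IllegalArgumentException("blob has no points");
        }
        int minX = Integer.MAX_VALUE;
        int minY = Integer.MAX_VALUE;
        int maxX = Integer.MIN_VALUE;
        int maxY = Integer.MIN_VALUE;
        for (Point p : blob) {
            minX = Math.min(minX, p.x);
            minY = Math.min(minY, p.y);
            maxX = Math.max(maxX, p.x);
            maxY = Math.max(maxY, p.y);
        }
        return new int[] {minX, minY, maxX, maxY};
    }

    // Width in pixels, counting both edge columns.
    static int width(Set<Point> blob) {
        int[] b = bounds(blob);
        return b[2] - b[0] + 1;
    }

    // Height in pixels, counting both edge rows.
    static int height(Set<Point> blob) {
        int[] b = bounds(blob);
        return b[3] - b[1] + 1;
    }

    public static void main(String[] args) {
        Set<Point> blob = new LinkedHashSet<>();
        blob.add(new Point(2, 5));
        blob.add(new Point(3, 5));
        blob.add(new Point(4, 5));
        blob.add(new Point(2, 6));
        System.out.println(width(blob) + " wide, " + height(blob) + " tall");
    }
}
```

The corner points need not be pixels of the blob itself. In the L-shaped example in `main`, the corner (4, 6) lies outside the blob, which is why the result is only an approximate height and width.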

Now that we can find a blob’s height and width, how do we define a line? To do this, I made an example image and based my definition of a line on what I felt was right. It is important to note that a line’s definition is not a fixed number, but rather scales with the width and height of the image.

What constitutes a line
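A definition that scales with the image could look something like the sketch below. The two fractions are placeholder values chosen purely for illustration, not the numbers used in the program, and the class and method names are hypothetical. The minimum-length requirement is one possible way to keep very small shapes from being counted as lines, not a decision made in this post.

```java
// Hypothetical sketch of a size-relative line test. The fractions are
// placeholder assumptions, not values taken from the post.
final class LineRules {
    // A line may be at most this fraction of the image across its thin side.
    static final double MAX_THICKNESS_FRACTION = 0.02;
    // A line must span at least this fraction of the image along its long side.
    static final double MIN_LENGTH_FRACTION = 0.10;

    // Thin in x, long in y, both relative to the image dimensions.
    static boolean isVerticalLine(int blobW, int blobH, int imgW, int imgH) {
        return blobW <= Math.max(1.0, imgW * MAX_THICKNESS_FRACTION)
                && blobH >= imgH * MIN_LENGTH_FRACTION;
    }

    // Thin in y, long in x, both relative to the image dimensions.
    static boolean isHorizontalLine(int blobW, int blobH, int imgW, int imgH) {
        return blobH <= Math.max(1.0, imgH * MAX_THICKNESS_FRACTION)
                && blobW >= imgW * MIN_LENGTH_FRACTION;
    }

    public static void main(String[] args) {
        int imgW = 1000;
        int imgH = 800;
        System.out.println(isVerticalLine(5, 200, imgW, imgH));   // thin and tall
        System.out.println(isHorizontalLine(300, 4, imgW, imgH)); // flat and wide
        System.out.println(isVerticalLine(3, 3, imgW, imgH));     // too short
    }
}
```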


The program has been edited to:

  • Sort blobs into set sizes
  • From these sizes, come up with a crude “goodness” measure
    • This measure will need to be discussed Monday
  • Find vertical and horizontal lines
    • Small shapes are also detected as lines. These either need to be counted as part of a larger line or removed from the list, and effective ways to do this should be discussed.

After next week we will look at fixing these bugs with the ultimate goal of eventually putting together a good aesthetic measure.
