Think you know everything there is to know about Star Wars? Think again.
A team of computer scientists at the École Polytechnique Fédérale de Lausanne (EPFL) in Switzerland used graph theory to sift through hundreds of pages about Star Wars and reveal details that even the most avid fan may be unaware of.
The Star Wars universe spans more than just the seven films. It is developed through other media such as computer and video games, television series, and hundreds of books. No matter how keen a fan you are, it would be hard to explore the whole universe and remember all of the details. For their data collection, the team used Wookieepedia, a fan-run wiki devoted to Star Wars.
“Fans will be surprised to learn, for example, that we came up with over 20,000 characters,” said Kirell Benzi, a PhD student and the project lead, according to an EPFL news article. “Among them, 7,500 play an important role. There are also 1,367 Jedi and 724 Sith. All the characters are spread among 640 different communities on 294 planets. And an analysis of the 10 largest communities reveals an aberration: nearly 80% of the galaxy’s population is human.”
The 36,000-year saga can be broken down into six periods: before the Republic, the Old Republic, the Empire, the Rebellion, the New Republic, and the Jedi Order. So, it makes sense that the team of researchers wanted to map out where each character fit in. However, in many cases, it wasn’t as simple as just plotting them on a timeline based on a date.
The researchers looked at how each character was connected to the others and used those connections to work out where in the storyline each one belongs: “Using these cross-references, we are able to accurately determine the time period of the character almost without fail, when this information is not directly provided in the books or movies.”
Images courtesy of Kirell Benzi
These two images show how it was done. Each time period is indicated using a particular color. In the image above, characters for whom the time period is unknown are shown in black. In the image below, the black dots are replaced with the color representing a best guess based on the other characters they are connected to.
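The article does not say which algorithm the EPFL team used, so the following is only a minimal sketch of the general idea: give each character with an unknown period the period that is most common among the characters it is connected to, and repeat until nothing changes. The function name `infer_periods`, the placeholder character names, and the toy connections are all hypothetical and chosen purely for illustration.

```python
from collections import Counter


def infer_periods(edges, known, max_rounds=10):
    """Guess missing time periods by majority vote among connected characters.

    edges: iterable of (character, character) pairs (undirected connections)
    known: dict mapping some characters to a known time period
    Illustrative sketch only; not the EPFL team's actual method.
    """
    # Build an undirected adjacency map from the edge list.
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    labels = dict(known)
    for _ in range(max_rounds):
        updates = {}
        for node in sorted(neighbors):
            if node in labels:
                continue  # period already known or already guessed
            # Count the periods of neighbors that currently have one.
            votes = Counter(labels[n] for n in neighbors[node] if n in labels)
            if votes:
                # Most common period wins; ties are broken alphabetically.
                updates[node] = max(sorted(votes), key=lambda p: votes[p])
        if not updates:
            break  # nothing new could be inferred
        labels.update(updates)
    return labels


# Toy data: placeholder characters, not real Wookieepedia entries.
edges = [("a", "c"), ("b", "c"), ("c", "e"), ("e", "d"), ("d", "f")]
known = {"a": "Old Republic", "b": "Old Republic", "e": "Empire"}

result = infer_periods(edges, known)
for character in sorted(result):
    print(character, result[character])
```

In this toy graph, the characters that start without a period ("c", "d" and "f") play the role of the black dots, and each pass of the loop colors in those that have at least one labeled neighbor. A real dataset would need a more careful treatment of ties and of characters with no labeled connections at all.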
Although it is neat that they were able to use this kind of connection graph for a fictional universe, it is even more interesting to think about this type of method being used on real historical data. We could fill in knowledge gaps in all kinds of fields.
“The program maps out connections in the mass of unorganized data available on the net,” said Benzi, according to the EPFL news article. “In addition to extracting data according to extremely precise criteria, the algorithms can also create links among data points, sort them, quantify them, interpret them and find missing information. All this in very little time. The results are then presented in the form of interactive charts that are easy to read and understand.”