Google Fusion Tables
Fusion Tables works with Google Spreadsheets much better and faster than its cousin. Google Fusion Tables is a tool for analyzing data, visualizing large data sets, and creating maps. Unsurprisingly, Google’s mapping software plays a big role in this tool’s standing among comparable software.
Lately, interactive infographics have become an increasingly popular tool for journalistic content creation. One of the key products for collecting, sharing and visualizing large amounts of data is Google Fusion Tables, launched in June 2009. It’s a web-based application that can create charts, tables, graphs, layouts and maps in minutes.
Fusion Tables stores information as a set of tables that Internet users can view and download. The essence of the service is sharing, collecting and visualizing these tables. It can build pie charts, scatter charts, histograms, line graphs, time charts, and (the most common application) geographic maps. Data is exported in CSV format.
Structure of Fusion Tables
Queries to the service come from several possible sources:
- the Fusion Tables website;
- standalone applications that use the API;
- visualizations embedded in other web pages (e.g., diagrams).
Next, map layers are generated from the query, which can be spatial or structured. The front-end dispatcher converts the queries into a common representation and passes them to the query processing module, which creates a query plan. The plan is executed by the structured-data back end, which uses a set of synchronously replicated Bigtable servers for storage. The main task of this replicated storage layer is to handle hundreds of thousands of tables with different schemas, sizes and query load characteristics.
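The flow described above (normalize the request, build a plan, execute it against the back end) can be sketched in a few lines. This is only an illustration under stated assumptions: the names `Request`, `QueryPlan`, `dispatch`, `plan` and `execute` are hypothetical, and a plain dictionary stands in for the replicated storage.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    """Common representation shared by all request sources (hypothetical)."""
    source: str      # e.g. "website", "api" or "embedded"
    table_id: str
    filters: dict = field(default_factory=dict)


@dataclass
class QueryPlan:
    """Ordered list of steps produced by the query processing stage."""
    table_id: str
    steps: list


def dispatch(source: str, raw: dict) -> Request:
    """Front end: turn a source-specific request into the common form."""
    return Request(source=source, table_id=raw["table"],
                   filters=raw.get("where", {}))


def plan(request: Request) -> QueryPlan:
    """Query processing: a table scan followed by one filter per condition."""
    steps = [("scan", request.table_id)]
    for column, value in sorted(request.filters.items()):
        steps.append(("filter", column, value))
    return QueryPlan(table_id=request.table_id, steps=steps)


def execute(query_plan: QueryPlan, backend: dict) -> list:
    """Back end: run the plan against an in-memory stand-in for storage."""
    rows = []
    for step in query_plan.steps:
        if step[0] == "scan":
            rows = list(backend.get(step[1], []))
        elif step[0] == "filter":
            _, column, value = step
            rows = [row for row in rows if row.get(column) == value]
    return rows


# Usage: the same pipeline serves any source once the request is normalized.
backend = {"trails": [{"name": "A", "region": "north"},
                      {"name": "B", "region": "south"}]}
request = dispatch("api", {"table": "trails", "where": {"region": "north"}})
result = execute(plan(request), backend)
```

Keeping the three stages separate means a new request source only needs its own `dispatch` logic; planning and execution stay unchanged.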
Storage Stack
Fusion Tables is built on two layers of Google’s storage stack.
Bigtable: it stores tuples as key-value pairs, sorted by key and distributed across multiple servers by key range. The value part of a tuple can be non-atomic, that is, a structured value that is not in first normal form. Bigtable provides a write operation that atomically inserts a new tuple. It also provides three read operations:
- key lookup, which retrieves the single pair with a given key;
- key prefix scan, which retrieves all pairs with a given key prefix;
- key range scan, which retrieves all pairs between a start key and an end key.
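The three read operations follow naturally from keeping keys sorted. The sketch below is an in-memory stand-in, not the Bigtable API: the class `SortedKV` and its method names are hypothetical, and treating the end key of a range as inclusive is an assumption.

```python
import bisect


class SortedKV:
    """Toy key-sorted store illustrating lookup, prefix scan and range scan."""

    def __init__(self):
        self._keys = []     # all keys, kept in sorted order
        self._values = {}   # key -> value

    def write(self, key: str, value) -> None:
        """Insert a new pair, or overwrite the value of an existing key."""
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def lookup(self, key: str):
        """Key lookup: the value stored under exactly this key, or None."""
        return self._values.get(key)

    def prefix_scan(self, prefix: str) -> list:
        """All pairs whose key starts with `prefix`, in key order."""
        start = bisect.bisect_left(self._keys, prefix)
        result = []
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break   # matching keys are contiguous in sorted order
            result.append((key, self._values[key]))
        return result

    def range_scan(self, start_key: str, end_key: str) -> list:
        """All pairs with start_key <= key <= end_key (inclusive: assumption)."""
        lo = bisect.bisect_left(self._keys, start_key)
        hi = bisect.bisect_right(self._keys, end_key)
        return [(key, self._values[key]) for key in self._keys[lo:hi]]


# Usage
kv = SortedKV()
for k in ["b", "ab", "c", "a"]:
    kv.write(k, k.upper())
prefixed = kv.prefix_scan("a")
ranged = kv.range_scan("ab", "b")
```

Because the keys are sorted, both scans reduce to a binary search for the starting position followed by a sequential read, which is why a prefix or range query does not need to touch unrelated keys.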
In addition, Bigtable keeps the history of each tuple. Internally, a tuple is stored as a “key, value, timestamp” triplet, where the timestamp records when the tuple was written. As a result, a single key can correspond to multiple versions of a tuple.
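A minimal sketch of such versioned storage, assuming timestamps are supplied by the caller as plain numbers. The class `VersionedStore` and its methods are hypothetical names used only for illustration; they are not part of Bigtable.

```python
class VersionedStore:
    """Toy store that keeps every (timestamp, value) version written per key."""

    def __init__(self):
        self._versions = {}   # key -> list of (timestamp, value), oldest first

    def write(self, key, value, timestamp) -> None:
        """Record a new version instead of overwriting the previous one."""
        versions = self._versions.setdefault(key, [])
        versions.append((timestamp, value))
        versions.sort(key=lambda pair: pair[0])

    def read_latest(self, key):
        """Value of the most recent version, or None if the key is unknown."""
        versions = self._versions.get(key)
        return versions[-1][1] if versions else None

    def read_as_of(self, key, timestamp):
        """Newest value written at or before `timestamp`, or None."""
        result = None
        for ts, value in self._versions.get(key, []):
            if ts > timestamp:
                break
            result = value
        return result

    def history(self, key) -> list:
        """All versions of the key as (timestamp, value) pairs, oldest first."""
        return list(self._versions.get(key, []))


# Usage: two writes to the same key yield two versions.
store = VersionedStore()
store.write("row1", "draft", timestamp=10)
store.write("row1", "final", timestamp=20)
```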
Megastore is a library on top of Bigtable. It provides higher-level primitives such as consistent secondary indexes, multi-row transactions, and consistent replication. Fusion Tables uses Megastore:
- to maintain property indexes;
- to provide table-level transactions;
- to replicate tables across multiple data centers.
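To make the idea of a property index concrete, here is a single-process sketch in which every write updates the row and its index entry together, so a lookup by property value never returns a stale row. The class `IndexedTable` is hypothetical; the real library provides this guarantee across servers and data centers, which this example does not attempt to model.

```python
from collections import defaultdict


class IndexedTable:
    """Toy table with one secondary index on a chosen property."""

    def __init__(self, indexed_property: str):
        self._prop = indexed_property
        self._rows = {}                    # row key -> row (dict)
        self._index = defaultdict(set)     # property value -> set of row keys

    def put(self, row_key: str, row: dict) -> None:
        """Write a row and keep the index in step with it."""
        old = self._rows.get(row_key)
        if old is not None and self._prop in old:
            # Drop the index entry that pointed at the previous value.
            self._index[old[self._prop]].discard(row_key)
        self._rows[row_key] = dict(row)
        if self._prop in row:
            self._index[row[self._prop]].add(row_key)

    def find(self, value) -> list:
        """Row keys whose indexed property equals `value`, in sorted order."""
        return sorted(self._index.get(value, ()))


# Usage: updating a row moves it between index entries.
table = IndexedTable("region")
table.put("r1", {"name": "A", "region": "north"})
table.put("r2", {"name": "B", "region": "north"})
table.put("r1", {"name": "A", "region": "south"})
```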
Fusion Tables API
An important aspect of building a data management and collaboration platform is allowing developers to extend its functionality. Fusion Tables does this through an API, which lets developers create applications that use Fusion Tables as a database. For example, the website mtbguru.com, which hosts a collection of bike routes, built an application that synchronizes the routes with Fusion Tables. The API supports querying, modifying, and defining data through corresponding statements. At present, modifying Fusion Tables schemas through the API is not supported. Authentication and data access rely on existing standards.
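As an illustration of the querying side, the sketch below builds a statement string, assuming the API accepts SQL-like statements. The helper `build_select`, the placeholder table ID `TABLE_ID` and the column names are all hypothetical, and the code only constructs text; it makes no network call and does not model authentication.

```python
def build_select(table_id: str, columns: list, where: dict = None) -> str:
    """Build a SQL-like SELECT statement for a table (illustrative only)."""
    column_list = ", ".join(columns) if columns else "*"
    sql = f"SELECT {column_list} FROM {table_id}"
    if where:
        conditions = []
        for column, value in where.items():
            if isinstance(value, str):
                # Quote escaping rules are not covered here, so reject
                # values containing a quote rather than guess at them.
                if "'" in value:
                    raise ValueError("quotes in values are not supported")
                conditions.append(f"{column} = '{value}'")
            else:
                conditions.append(f"{column} = {value}")
        sql += " WHERE " + " AND ".join(conditions)
    return sql


# Usage with hypothetical column names for a table of bike routes.
statement = build_select("TABLE_ID", ["name", "distance"], {"region": "north"})
```

An application that synchronizes its own records with a table would generate statements like this for reads and analogous ones for inserts and updates.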