Data flow is the streaming of data into and out of a system in order to extract information from one or more sources. The incoming data may be incomplete, may require validation, or may need to be translated from one format to another. Operations that may be required include transformation, filtering, sorting and grouping. Here are three examples of data flow, the last being more complex than the first two.
Numbering, Filtering and Formatting
asia_top_ten.taq selects the first 10 Asian cities from a list of 35 mega cities. The cities are numbered from 1 to 10 and the population numbers are formatted according to the system locale. This is a sample of an incoming datum:
The first value is a rank out of 35, followed by city, country, continent and population. This is the program:
template asia_top_ten
{ integer rank }
(
city = Megacity, country = Country, population = Population.format()
query<axiom> asia_top_ten (mega_city : asia_top_ten)
These are the first two cities of the solution:
asia_top_ten(2, city=Delhi, country=India, population=26,580,000)...
We see the double question mark criterion used with a non-trivial boolean expression that not only filters cities, but also increments the 'rank' variable (rank++).
Note that the population value produced by the built-in format function will vary according to system locale.
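For readers more familiar with general-purpose languages, the filter, number and format steps described above can be sketched in Python. This is only an illustration of the data flow, not how TAQ evaluates the template: the MegaCity record, the top_in_continent function and the sample records are all hypothetical, and the population is formatted with a fixed comma grouping rather than the system locale.

```python
from typing import NamedTuple


class MegaCity(NamedTuple):
    """Hypothetical record mirroring the fields of an incoming datum."""
    rank: int          # rank in the full list
    city: str
    country: str
    continent: str
    population: int


def top_in_continent(cities, continent, limit):
    """Keep the first `limit` cities on `continent`, renumbered from 1."""
    result = []
    rank = 0
    for c in cities:
        if c.continent != continent:
            continue                  # filter out other continents
        if rank >= limit:
            break                     # enough matches collected
        rank += 1                     # number only the cities that pass
        # Fixed comma grouping (assumption); TAQ's format() follows the locale.
        result.append((rank, c.city, c.country, format(c.population, ",")))
    return result


# Hypothetical sample records, not taken from the mega city list.
sample = [
    MegaCity(1, "Alpha", "Aland", "Asia", 30000000),
    MegaCity(2, "Beta", "Bland", "Europe", 20000000),
    MegaCity(3, "Gamma", "Cland", "Asia", 15000000),
    MegaCity(4, "Delta", "Dland", "Asia", 12000000),
]
top_two = top_in_continent(sample, "Asia", 2)
```

As in the TAQ program, the counter advances only when a city passes the filter, so the output ranks run from 1 upward regardless of the rank in the incoming list.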
Numerical Analysis
The TAQ language allows a flexible approach to dealing with data in tabular formats, for example spreadsheet data. The following example shows how a sizeable and incomplete table of values can be input into a TAQ script to perform an analysis and produce readable data.
more_agriculture.taq has a "more_agriculture" query that produces a list of countries which have increased the area under agriculture by more than 1% over the twenty years between 1990 and 2010. If you look at the agriculture-land.taq file, you will see data for 210 countries with percentage values for each year spanning a 50-year period. Some of the values are 'NaN', indicating that data is not available.
include "surface-land.taq"
template increased
(
country ? agri_change > 1.0
template surface_area_increase
(
integer surface_area = (increased.agri_change)/
query<axiom> more_agriculture
(agri_decades : increased) ->
(surface_area : surface_area_increase)
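The two-stage flow of this query (select countries whose change exceeds the threshold, then derive a surface area) can be sketched in Python. This is a rough analogue under stated assumptions, not a translation of the TAQ program: the function names reuse the template names only for readability, the data layout and sample values are hypothetical, the change is assumed to be a difference in percentage points, and the conversion to area (change / 100 × total surface area) is an assumption.

```python
import math


def increased(agri_percent, start=1990, end=2010, threshold=1.0):
    """Return {country: change} where the agricultural share grew by more
    than `threshold` percentage points; countries with NaN data are skipped."""
    result = {}
    for country, by_year in agri_percent.items():
        first = by_year.get(start, math.nan)
        last = by_year.get(end, math.nan)
        if math.isnan(first) or math.isnan(last):
            continue                  # data not available for this country
        change = last - first
        if change > threshold:
            result[country] = change
    return result


def surface_area_increase(changes, surface_area_km2):
    """Convert each change into an area (assumed: change % of total area)."""
    return {
        country: round(change * surface_area_km2[country] / 100.0)
        for country, change in changes.items()
        if country in surface_area_km2
    }


# Hypothetical sample data: percentage of land under agriculture per year.
agri = {
    "Aland": {1990: 40.0, 2010: 42.5},
    "Bland": {1990: math.nan, 2010: 50.0},   # missing start value
    "Cland": {1990: 30.0, 2010: 30.5},       # change below threshold
}
areas = {"Aland": 100000}                    # total surface area, km2

changes = increased(agri)
area_gain = surface_area_increase(changes, areas)
```

Chaining the two functions mirrors the arrow in the query, where the output of the first template feeds the second.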