Democratizing fast access to Big Data

26 Apr 2016

For many years, business users have had limited and slow access to the massive data sets their organizations own. Their data has been locked up in databases and data warehouses (now called data lakes). Typical users have never seen their actual data. Instead, they have had to wait for limited summaries provided by BI tools or by expert-coded ETL data pipelines, which are written as complex queries and programs, often spanning multiple technologies and tools. In most cases, the cost of accessing the whole dataset is so high that analyses must be run as batch jobs.

Meanwhile, analysts empowered by kdb+ have been able to express their entire tasks concisely in a few statements of q, obtaining in seconds or minutes results that previously took hours or days. Q is a powerful language used extensively in finance, IoT and other time-series applications, and it is executed by the underlying k data language. Q is a simple functional data language that both simplifies and extends SQL. Both the queries and the analytics are written in the same language and execute inside the database.

It is natural to ask, “Can’t we bring the power of kdb+ to users who are non-expert programmers?” The answer is “Yes, of course we can.” This is now being done through visual and textual DSLs, which are easier to learn and are designed for the data analyst and end user. These users can see their information expressed in visual analytics such as heat maps, or in operational dashboard charts.

These new UIs are democratizing the use of massive data sets, giving all business users and analysts access to their complete datasets, as well as a much more powerful means of visualizing and analyzing their data.

Dave Thomas is chief scientist at Kx Labs and was cofounder of Bedarra Research Labs. Prior to developing complex commercial applications in kdb+, Dave was known for his contributions to Object Technology, including IBM VisualAge and Eclipse IDEs, Smalltalk and Java virtual machines. He is a thought leader in large-scale software engineering and a founding director of the Agile Alliance.


Kx Insights: Machine learning and the value of historical data

2 Aug 2018

Data is being generated at a faster rate now than ever before. IDC has predicted that in 2025, there will be 163 zettabytes of data generated each year, a massive increase from the 16.1 zettabytes created in 2016. These high rates of data generation are partially an outcome of the multitude of sensors found on Internet of Things (IoT) devices, the majority of which are capable of recording data many times per second. IHS estimates that the number of IoT devices in use will increase from 15.4 billion devices in 2015 to 75.4 billion in 2025, indicating that these immense rates of data generation will continue to grow even higher in the years to come.

SEMICON 2018 Snapshot: Data and the Era of AI

24 Jul 2018

By Bill Pierson

The future of the semiconductor industry is looking bright, judging by the breadth of new developments, initiatives and innovations on display at SEMICON West in San Francisco this July. Industry-leading companies presented the latest technical and business insights into today’s opportunities and challenges, particularly in the areas of smart manufacturing and […]