This project will showcase how to create an interactive data science dashboard using GridDB and Streamlit.
Using data from past projects already ingested into GridDB, we will use Streamlit to visualize New York City crime complaints. Streamlit makes it quick and easy to develop dashboards with a variety of input and output widgets. In this project, we use input widgets to build a GridDB TQL query string and a chart widget to display the returned data.
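As a minimal sketch of that idea, the helper below assembles a TQL query string from values that, in the actual app, would come from Streamlit input widgets such as `st.selectbox` and `st.slider`. The column names (`OFFENSE`, `BOROUGH`) are hypothetical illustrations, not the real schema used in the project.

```python
def build_tql(offense, borough, limit=500):
    """Assemble a GridDB TQL query string from dashboard inputs.

    In the Streamlit app these arguments would be widget values, e.g.:
        offense = st.selectbox("Offense", offenses)
        limit   = st.slider("Rows", 100, 1000)
    Column names here are hypothetical.
    """
    conditions = []
    if offense:
        conditions.append(f"OFFENSE = '{offense}'")
    if borough:
        conditions.append(f"BOROUGH = '{borough}'")
    tql = "SELECT *"
    if conditions:
        tql += " WHERE " + " AND ".join(conditions)
    tql += f" LIMIT {limit}"
    return tql
```

The resulting string can be passed straight to the container's `query()` method in the GridDB Python client, and the rows fed into a Streamlit chart widget.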
In this tutorial, we will import comma-separated values (CSV) data into GridDB using the popular ETL tool, Apache Nifi. Nifi is an enterprise-ready data plumbing platform that is highly configurable and extensible.
ETL is an acronym that stands for Extract, Transform, and Load: copying data from a source to a destination while converting it between the formats and structures each side expects. While there are many advanced enterprise ETL tools, many developers have used basic text processing tools like awk, sed, and grep to build rudimentary ETL pipelines.
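To make the three stages concrete, here is a toy ETL pass in Python: it extracts rows from CSV text, transforms them (renaming fields and casting types), and loads them as JSON lines ready for a destination system. The field names are invented for illustration only.

```python
import csv
import io
import json

def etl(csv_text):
    """A toy ETL pass: CSV in, JSON lines out."""
    reader = csv.DictReader(io.StringIO(csv_text))  # Extract
    out = []
    for row in reader:
        record = {                                   # Transform:
            "city": row["City"].strip().title(),     #   normalize text
            "population": int(row["Population"]),    #   cast to int
        }
        out.append(json.dumps(record))               # Load: serialize
    return "\n".join(out)
```

A real pipeline would of course stream rather than buffer, and a tool like Nifi handles retries, scheduling, and back-pressure that this sketch ignores.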
In this tutorial, we will cover three different…
Last year we released a guide/tutorial on how to ingest data using GridDB and Kafka. In that guide, we walked developers through the process of feeding data from a CSV, into the console producer, through Kafka and then through the GridDB JDBC Sink through to GridDB itself.
For this blog, because of the new update to the JDBC Kafka Connector, we will reverse direction. That is, we will use the GridDB JDBC Source to move data from the GridDB database, through Kafka, and out to the consumer (or another Kafka Sink plugin).
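For orientation, a Kafka Connect JDBC source configuration for this setup would look roughly like the fragment below. The connection URL, cluster name, credentials, and table name are placeholders, not values from this blog; consult the GridDB JDBC driver documentation for the exact URL format your deployment needs.

```properties
name=griddb-jdbc-source
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
# Placeholder URL -- adjust host, port, and cluster name for your install
connection.url=jdbc:gs://239.0.0.1:41999/defaultCluster/public
connection.user=admin
connection.password=admin
# Re-read the whole table on each poll; other modes (e.g. timestamp)
# are available if the table has a suitable column
mode=bulk
topic.prefix=griddb-
table.whitelist=meter
```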
This article continues the series on building a Python-based real-time reporting tool for IoT data. As in the first article of this series, we use IoT connectivity data. The dataset contains timestamps, events, SIM card IDs, and data usage. IoT devices typically send this data in the background so that support engineers can monitor them and troubleshoot issues in a timely manner.
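To illustrate the shape of that data, the sketch below builds a small pandas frame with the fields described (timestamp, event, SIM ID, data usage) and computes a per-SIM usage summary of the kind a support engineer would watch. The column names and values are assumptions for illustration, not the actual dataset.

```python
import pandas as pd

# Illustrative connectivity records; field names are assumed
records = [
    {"ts": "2023-05-01 10:00:00", "event": "PDP_CONNECT",
     "sim_id": "89440001", "bytes_used": 0},
    {"ts": "2023-05-01 10:05:00", "event": "DATA",
     "sim_id": "89440001", "bytes_used": 1_250_000},
    {"ts": "2023-05-01 10:07:00", "event": "DATA",
     "sim_id": "89440002", "bytes_used": 640_000},
]
df = pd.DataFrame(records)
df["ts"] = pd.to_datetime(df["ts"])

# Per-SIM data usage -- the kind of aggregate a monitoring view shows
usage = df.groupby("sim_id")["bytes_used"].sum()
```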
In the previous article, we built a real-time dashboard that pulls data from the GridDB database and refreshes data visualizations automatically. In this article, we will build a simple application with Python’s kivy package. …
In a previous blog on Docker, we ran the GridDB server in one container and the application in another. It worked well, but there have been many requests to run GridDB in a container on Docker Desktop with the application on the host.
It should be easy, right? Wrong. On a Linux host it just works, but on Windows and macOS hosts the different networking stacks don't allow direct routing to the container, which prevents the usual GridDB configuration from working.
After spending time trying to make Windows or MacOSX Docker behave like Linux, a trick was discovered when configuring…
In part 1 of this blog, we implemented a GridDB Python script to save and retrieve the Twitter data. In this blog, we will continue with the sentiment analysis and visualization of the sentiment data. We will calculate a sentiment value for every tweet, store the sentiment values, and visualize them to draw useful insights about the popular fashion brands. Furthermore, we will also apply some data science techniques such as hierarchical clustering and visualize the result using dendrograms. …
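As a rough preview of the clustering step, the sketch below runs agglomerative (hierarchical) clustering with SciPy on hypothetical per-brand sentiment scores; in the blog, those scores would come from the tweets stored in GridDB. The brand names and values are invented for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

# Hypothetical per-brand average sentiment (polarity, subjectivity)
brands = ["BrandA", "BrandB", "BrandC", "BrandD"]
scores = np.array([
    [0.61, 0.40],
    [0.58, 0.45],
    [-0.20, 0.70],
    [0.05, 0.55],
])

# Ward-linkage hierarchical clustering; Z is the linkage matrix that
# scipy.cluster.hierarchy.dendrogram(Z, labels=brands) would draw
Z = linkage(scores, method="ward")
```

Each row of `Z` records one merge (the two clusters joined, their distance, and the new cluster size), which is exactly the structure a dendrogram visualizes.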
In this tutorial, we will see how to analyze time-series data stored in GridDB using Python. The outline of the tutorial is as follows:
1. Loading the dataset using SQL and Pandas
2. Preprocess the data to handle null and missing values
3. Build a classifier for our data
This tutorial assumes prior installation of GridDB, Python 3, and the associated libraries. If you have not installed any of the packages below, do so before continuing with the tutorial:
1. GridDB
2. Python 3
3. GridDB Python Client
4. NumPy
5. Pandas
6. Matplotlib
7. Scikit-learn…
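The three outline steps above can be sketched end to end as follows. The data here is fabricated; in the tutorial, step 1 would instead pull rows from GridDB via SQL into a pandas DataFrame, and the column names are illustrative assumptions.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# 1. Load -- stand-in for rows fetched from GridDB with SQL
df = pd.DataFrame({
    "temperature": [21.0, 22.5, None, 30.1, 31.4, 29.8, 20.2, 33.0],
    "humidity":    [40.0, 42.0, 45.0, 70.0, None, 68.0, 38.0, 75.0],
    "label":       [0, 0, 0, 1, 1, 1, 0, 1],
})

# 2. Preprocess -- fill missing values with the column mean
df = df.fillna(df.mean())

# 3. Build a classifier on the cleaned data
X, y = df[["temperature", "humidity"]], df["label"]
clf = LogisticRegression().fit(X, y)
acc = clf.score(X, y)
```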
This blog is a continuation of our previous cryptocurrency blog, found here.
With Bitcoin currently leading the cryptocurrency market, its market cap stands at $1,083,949,725,691 as of this writing; given Bitcoin's high volatility, the figure may well have changed by the time this post is published.
Satoshi Nakamoto launched the Bitcoin network in 2009, which was followed by the cryptocurrency's first recorded commercial transaction, in which Laszlo Hanyecz, a programmer, purchased two Papa John's pizzas for 10,000 bitcoins. Back then, the coin held no real value and the purchase wasn't a big deal. But in the…
Today, we will cover how to scrape data from any website using Python's Scrapy library. We will then save the data to JSON and HTML files. Finally, we will see how we can also store this data in GridDB for efficient long-term use.
This post requires the prior installation of the following:
We also recommend installing Anaconda Navigator, if it is not already installed. Anaconda provides a wide range of tools for data scientists to experiment with, and a virtual environment helps you meet an application's specific version requirements without interfering with the system-wide Python installation.
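As a small preview of the export step, the sketch below saves scraped items to a JSON file of the kind that would later be loaded into GridDB. The items here are hypothetical stand-ins for what a Scrapy spider would yield.

```python
import json
from pathlib import Path

# Hypothetical items of the shape a Scrapy spider might yield
items = [
    {"title": "Post one", "url": "https://example.com/1"},
    {"title": "Post two", "url": "https://example.com/2"},
]

def save_json(items, path):
    """Persist scraped items as pretty-printed JSON for later ingestion."""
    Path(path).write_text(json.dumps(items, indent=2))

save_json(items, "items.json")
```

Scrapy can also write these formats directly via its feed exports (`-o items.json` on the command line); the function above just makes the step explicit.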
Geohashes are alphanumeric encodings of areas on Earth. The more bits in the encoding, the more precisely the area is described: a 1-character geohash covers approximately 2500 km, while an 8-character geohash covers approximately 20 m. For example, the 9q geohash covers most of California and the western United States, while the San Francisco / San Jose bay area spans several geohashes: 9qb, 9qc, 9q8, and 9q9.
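The encoding itself is simple enough to sketch in a few lines: bisect the longitude and latitude ranges alternately, emit a 1 or 0 depending on which half the point falls in, and pack each group of five bits into a character of the geohash base-32 alphabet. This is a minimal illustration of the standard algorithm, not production code.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)

def geohash_encode(lat, lon, precision=8):
    """Encode a lat/lon point as a geohash of `precision` characters."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    bits = []
    even = True  # bits alternate, starting with longitude
    while len(bits) < precision * 5:
        rng, val = (lon_range, lon) if even else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if val > mid:
            bits.append(1)
            rng[0] = mid  # keep the upper half
        else:
            bits.append(0)
            rng[1] = mid  # keep the lower half
        even = not even
    # Pack every 5 bits into one base-32 character
    chars = []
    for i in range(0, len(bits), 5):
        n = 0
        for b in bits[i:i + 5]:
            n = (n << 1) | b
        chars.append(BASE32[n])
    return "".join(chars)
```

For example, encoding a point in San Francisco yields a hash beginning with `9q`, matching the coverage described above; truncating a geohash always yields the hash of the containing, coarser cell.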