Coding Phase - Week 7
This week I spent my time parallelizing the CAPTCHA Monitor using processes on the host machine. Previously I was using Docker Swarm to replicate the instances of the code, but it turned out to be slow and memory-hungry. Instead, I used Python’s multiprocessing library to replicate the workers. This required a few changes to the architecture: I needed to separate the code that manages Tor and Tor Browser from the main program loop. Now, the main program loop creates instances of that code in separate processes and makes sure that they keep running. With the updated code, I started collecting data again. Every day I collect data for a different metric.
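To make the "main loop keeps the workers running" idea concrete, here is a minimal sketch, not the actual CAPTCHA Monitor code. The names worker_loop, ensure_running, and spawn_process are hypothetical, and the worker body is only a placeholder for the code that manages Tor and Tor Browser.

```python
import multiprocessing as mp
import time


def worker_loop(worker_id):
    """Hypothetical stand-in for the code that manages Tor and Tor Browser."""
    while True:
        # A real worker would fetch a page through Tor here and record the result.
        time.sleep(0.1)


def ensure_running(workers, spawn):
    """Replace every missing or dead worker; return the slots that were (re)started."""
    restarted = []
    for slot in list(workers):
        proc = workers[slot]
        if proc is None or not proc.is_alive():
            workers[slot] = spawn(slot)
            restarted.append(slot)
    return restarted


def spawn_process(slot):
    """Start one worker in its own OS process."""
    proc = mp.Process(target=worker_loop, args=(slot,), daemon=True)
    proc.start()
    return proc


if __name__ == "__main__":
    workers = {slot: None for slot in range(3)}
    for _ in range(5):  # a real main loop would run until shutdown
        ensure_running(workers, spawn_process)
        time.sleep(0.2)
    for proc in workers.values():
        proc.terminate()
        proc.join()
```

Passing the spawn function in as an argument keeps the restart logic separate from process creation, so the loop can be exercised without starting Tor at all.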
The next step was to display the collected data in a dashboard. You might remember that I already mentioned a dashboard and included a screenshot of it. Actually, that was the second dashboard solution I tried. In the very beginning, I tried using Grafana. It is a really neat open-source dashboard solution, and it has well-designed layout options. These are all great features, but Grafana is geared towards time series data like the temperature of a CPU or the RAM usage of a computer, so the data sources and the backend are designed for that kind of data. It also doesn’t provide much flexibility with data manipulation: Grafana wants to display what the database query returns directly on the dashboard. Unfortunately, I needed more flexibility in the way I process data, and I sometimes needed to combine multiple queries. Still, I used Grafana for a while to see if I was wrong, and I wasn’t.
I did further research, and I found Metabase, which is another open-source dashboard solution. Unlike Grafana, Metabase had all the flexibility I needed in the backend to process data before showing it on the dashboard. I really liked using Metabase, but it had a lot of flaws on the frontend. For example, some of the graphs were clipped for no apparent reason, and there was no option for fixing that. It was also consuming a lot of memory on my VPS, and I thought I could use that memory for data collection rather than spending it on the dashboard for no solid reason.
So, I ended up building my own dashboard using Node.js, Bootstrap, Chart.js, and Express.js:
I used what I learned from my weeks of dashboard searching to create something simple and elegant. I used Node.js & Express.js on the backend to create an API and Bootstrap & Chart.js on the frontend for displaying data. The cool thing is I can process the data the way I want on the backend and send it to the dashboard through the API. If I don’t like anything about the frontend, I can just change it! Sure, I could have made changes to the other open-source dashboard solutions as well, but I would have needed to go through an unnecessary number of steps to achieve that. Also, now I can use the same backend API for other purposes. I was already planning to have an API for third parties to fetch data from the system, and there I have it!
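As a rough illustration of the "process on the backend, then send to the chart" step, here is a small sketch that combines two queries into one object in the labels/datasets shape Chart.js expects. The real backend is Node.js and Express.js; this sketch uses Python only to stay consistent with the collector code. The results table, its columns, and the sample rows are all made up for the example.

```python
import json
import sqlite3


def build_chart_payload(conn):
    """Combine two queries into one Chart.js-style {labels, datasets} object."""
    # Query 1: how many measurements were taken each day.
    totals = dict(conn.execute(
        "SELECT day, COUNT(*) FROM results GROUP BY day"))
    # Query 2: how many of those measurements hit a CAPTCHA.
    captchas = dict(conn.execute(
        "SELECT day, COUNT(*) FROM results WHERE captcha = 1 GROUP BY day"))
    labels = sorted(totals)
    rates = [round(100 * captchas.get(day, 0) / totals[day], 1) for day in labels]
    return {"labels": labels,
            "datasets": [{"label": "CAPTCHA rate (%)", "data": rates}]}


# Hypothetical schema and invented sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (day TEXT, captcha INTEGER)")
conn.executemany("INSERT INTO results VALUES (?, ?)", [
    ("day-1", 1), ("day-1", 0),
    ("day-2", 0), ("day-2", 0),
])
payload = build_chart_payload(conn)
print(json.dumps(payload))
```

An API endpoint would return this JSON, and the frontend could hand it to a chart more or less as-is. That is the kind of in-between processing I could not easily do when the dashboard displayed query results directly.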
Finally, I spent some time moving my project to Tor Project’s new GitLab server. Previously, the code, issue tracker, and wiki were all in different places. Now, they are all unified in the same place. GitLab also has a lot of extra productivity tools, and I can’t wait to use them. Here is the new home for my code: https://gitlab.torproject.org/woswos/CAPTCHA-Monitor