Hello and welcome to my little nook of the internet. Here I have my blog, which mainly contains posts about tech, the outdoors, cooking, and sometimes mead brewing.

A bit of background information before you step into my world of crazy. I am Lars, a.k.a. looopTools, a Software Engineer living in East Jutland, Denmark. I have 10+ years of industry experience, mainly through student positions, but also as a self-employed consultant and as a full-time employee. I mainly work in low-level user space, high-level kernel space, and storage systems in general. Besides research and software development, I also love the outdoors and try to go out as often as possible (not enough at the moment), and I am an aspiring author currently working on a few different novels. I also dabble in being a more advanced home cook and baker, which you may see some posts about. Finally, I like the ancient art of brewing mead, a.k.a. honey wine, and experiment with different flavour combinations and ageing times.

Let me explain what I meant!!!

01 September 2021

In July I replied to a tweet by Scott Hanselman, and my reply was apparently offensive.

My reply, both part 1 and part 2, is here:

Whiteboard programming has multiple benefits.
a) it helps gauge the communication skills of the applicant 
b) catch “fraudulent” devs who cannot code without autocomplete, intellisens, and so on 
c) (personal opinion) it exposes better what the applicant is thinking 1/2

2/2 because when people sit with a keyboard they seem to be more quite, 
where at the whiteboard words flow more freely

The main issue people had was with the following: “b) catch “fraudulent” devs who cannot code without autocomplete, intellisens, and so on”. There were a couple of issues:

  • People apparently do not know what fraudulent means
  • and thought I said that all who use IntelliSense and autocomplete are not developers

In this post I would like to fully explain what I meant with b), to address the issues people clearly had with it. So first off, fraudulent does not mean a cheater (look it up, people). It is a synonym for one who is deceiving, i.e., in this case, someone who wants to give the impression that they are a developer but is not. NOW! The reason this is at all relevant for me is that I have been part of hiring committees a couple of times where an applicant could not even write a simple for loop on the whiteboard. In one case it was so bad that I wrote the following on the whiteboard and asked the applicant to replace the blank spots with the appropriate tokens.


for ___ in ____:
    print(___)
 

But the applicant was unable to fill in the code, even though the applicant's CV listed the last position as a software developer (for a Fortune 500 company) working in Python for roughly five years. Therefore, to me and the other members of the hiring committee, this applicant was a fraud. Sadly, this is not the only time I have experienced this, and therefore I am a firm believer in whiteboard programming and its ability to expose “fraudulent programmers”.

Back to fraudulent versus cheating: people thought I meant that people who are dependent on autocomplete and other IntelliSense tools are cheaters. I would like to emphasise here that it is okay to utilise IntelliSense tools if they make you a better developer or improve your performance. However, I do believe that you should know the basics of the languages you use (not necessarily the libraries outside the standard library) well enough to write trivial programs and use the basic data structures without the need for IntelliSense. Now why do I think this? I believe this because I have seen, and know, developers who in, for instance, C# translate for-loops to LINQ expressions which they are unable to read, or translate a function to a lambda expression even though they cannot read it. Why, you ask? The argument is (and I kid you not), and I quote: “It makes me look smarter to management”. This is despite the fact that well-written code in simple structures can be a) more effective and b) easier to maintain later.

[LaTeX Part 1] Advice for smart things to do in LaTeX

27 May 2021

Throughout most of my academic career I have used LaTeX for everything, and when I say everything I mean everything. I use it for papers, books, presentations, and more. In my opinion it is so much easier to use than Microsoft Word, LibreOffice Writer, and Apple Pages, although these have their benefits as well. In this post, and maybe future posts, I will cover some things I do to make my life in LaTeX even easier.

1. Use vector graphics for your Figures and Plots

When reviewing papers, one thing I often see is that plots and figures are pixelated, and in most cases this is due to authors not using vector graphics (I have been at fault for doing this myself). Therefore, when you make your figures and plots, use a tool that is able to generate vector graphics output. a) It allows you to scale the image as you please and b) you do not get the ugly pixelated pictures any more.

If the tool you use provides Encapsulated PostScript (EPS) as an output option, then you can use the epstopdf package to automatically convert your EPS figures to PDFs without additional work. Sample code for this is shown below:

    \usepackage{graphicx}
    \usepackage{epstopdf}
    
    \begin{figure}
        \includegraphics[scale=.5]{PATH/figure.eps}
    \end{figure}

If you decide to go with Scalable Vector Graphics (SVG) instead, you might run into problems. There are native LaTeX solutions for handling SVG, but none of them have given me a result I like. My recommendation here is to convert the SVG to EPS and then use epstopdf.

2. SVG names for colours

There exists a package for LaTeX called xcolor which provides definitions for different colours, allowing us to write colour names such as red, green, black, and so on. However, what many do not know is that there is an option to xcolor which also grants access to colours using their SVG names; in my experience these are often easier to “read” when printed. You can see the different SVG colour names on pages 38 and 39 of the xcolor documentation. In the documentation you will see that there are also the options dvipsnames and x11names. These two options are good too and can be used in tandem with svgnames, but dvipsnames has colour definitions that clash with the svgnames definitions. All three options are great, I just prefer svgnames.

Additionally, it is (in my experience) beneficial to provide the option to the document class, as it then propagates to other packages besides xcolor. To provide the option directly to xcolor instead, the usepackage statement has to look as follows:

\usepackage[svgnames]{xcolor}

3. Keep one BibTeX file

One thing I have been extremely annoyed with, during my PhD in particular, is that I have copied BibTeX entries from one BibTeX file to another, between different papers. To me this was not a good solution, but it was the only one I really could find. However, someone made me a suggestion and I like it. Have a single BibTeX file with all the references you have EVER used and keep it in a Git repository. Then, when you start a new paper, add the repository as a submodule and push changes to the repo when you add a new reference. This is something that I will start to do from now on.

4. Keep the BibTeX file organised

Another issue with BibTeX files is that they are often left unstructured, which is super annoying if you want to navigate them. I suggest ordering the entries by category and putting in small comments about what a given category is. Also, stop using the DOI or another id as the BibTeX reference name. Give the entry a good name that makes it easy to remember why it has that name (long names are not always bad!). I even know people who keep a small description of each BibTeX entry as a comment just above it.

5. Get to know your compiler

Get to know the different parameters and options you can pass to your LaTeX compiler. Sometimes you will be surprised by what the compiler can do for you. As an example, I use pdflatex, and one option I constantly use is -jobname, with which I can set the name of the output PDF.

6. If you can do it in LaTeX, do it in LaTeX.

This may or may not be controversial. So, what do I mean? Well, if you can do something in LaTeX, do it in LaTeX and do not use another tool. Again, what the hell am I talking about? Well, I mean that instead of using an external tool for making figures, use TikZ, or if you need to make a plot, use pgfplots. Before you flip the table and rage quit, HEAR ME OUT. Even though the initial learning curve for packages such as TikZ is steep, it is time well spent. As you become more proficient with the package and start having your own little macro collection, you will become more effective. Additionally, you will have more control over your stuff, as the complete landscape of LaTeX commands is at your disposal.
So try to do it.

./Lars

[PostgreSQL] Select from list instead of table

20 April 2021

One of the things I love about SQL in general, and not just PostgreSQL, is the ability to say SELECT * FROM tbl WHERE column IN LIST, where the list is an input. But for a while I have wanted to do the reverse, SELECT * FROM LIST WHERE LIST.element IN (SELECT column FROM tbl), and I could not find a good way to do this UNTIL NOW.

We have a table tbl with a column called fingerprint, and for each fingerprint in a list we want to check whether it is in tbl. Just for fun, let us say that the fingerprints in the table are [1, 3, 5, 7] and the list of fingerprints we want to check is [1, 2, 4, 5, 8]. We want to know which fingerprints from the list are in tbl, and we can formulate that as the query:

SELECT * FROM (values (1), (2), (4), (5), (8)) as v(fingerprint) 
    WHERE v.fingerprint IN (SELECT fingerprint FROM tbl)

This will give us the result [1] and we can also ask the inverse, which elements of the list are not in tbl:

SELECT * FROM (values (1), (2), (4), (5), (8)) as v(fingerprint) 
    WHERE v.fingerprint NOT IN (SELECT fingerprint FROM tbl)

For me, both of these queries are super powerful, and I will add the ability to express them to QueryC++.
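Since the list is usually not known up front, the query text has to be generated. Below is a minimal sketch of how that could look in plain C++. The function name list_in_table_query and its negate parameter are made up for this illustration and are not part of QueryC++. It only handles integer fingerprints; text values would need proper escaping or query parameters.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of QueryC++): builds the query from this post
// for an arbitrary list of integer fingerprints.
std::string list_in_table_query(const std::vector<long>& fingerprints,
                                bool negate = false)
{
    // A VALUES list needs at least one row.
    assert(!fingerprints.empty());

    std::string query = "SELECT * FROM (values ";
    for (std::size_t i = 0; i < fingerprints.size(); ++i)
    {
        if (i != 0)
        {
            query += ", ";
        }
        query += "(" + std::to_string(fingerprints[i]) + ")";
    }
    query += ") as v(fingerprint) WHERE v.fingerprint ";
    query += negate ? "NOT IN" : "IN";
    query += " (SELECT fingerprint FROM tbl)";
    return query;
}
```

Calling list_in_table_query({1, 2, 4, 5, 8}) produces the first query above on a single line, and passing true as the second argument produces the NOT IN variant.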

./Lars

Introducing QueryC++

14 April 2021

I love C++, and I love PostgreSQL. What I do not like is writing raw SQL queries using strings in C++. First of all, I do not like to see SQL query strings in C++, and second, developers (myself included) sometimes make syntax errors that we cannot catch at compile time. Therefore, to reduce the potential for errors, I have been looking for libraries to solve this. Sadly, the libraries I found were either out of date, tricky to set up correctly, or too complex for simply generating queries. For this reason, I decided to create a new library called QueryC++ to provide a bare-bones, simple library for generating SQL queries in C++. Additionally, the purpose is to provide a modern library that takes advantage of the latest features in C++17 and C++20.
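To illustrate the compile-time point, here is a small sketch of the general idea. It is NOT the QueryC++ API; the class select_query and its member functions are invented for this example. The SQL keywords are spelled once inside the class, so a typo like "SELCT" in a query string cannot happen, and misspelling where() is a compile error instead of a run-time SQL error. The condition is still a raw string here, so this only removes some of the risk.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical sketch of a tiny query builder, not the QueryC++ API.
class select_query
{
public:
    explicit select_query(std::string table) : m_table(std::move(table))
    {
        // A query without a table name makes no sense; fail loudly in debug builds.
        assert(!m_table.empty());
    }

    // Sets the condition of the WHERE clause (still a raw string in this sketch).
    select_query& where(const std::string& condition)
    {
        m_condition = condition;
        return *this;
    }

    // Assembles the final SQL text.
    std::string str() const
    {
        std::string query = "SELECT * FROM " + m_table;
        if (!m_condition.empty())
        {
            query += " WHERE " + m_condition;
        }
        return query;
    }

private:
    std::string m_table;
    std::string m_condition;
};
```

With this, select_query("tbl").where("fingerprint = 1").str() gives "SELECT * FROM tbl WHERE fingerprint = 1".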

The library is open source under the BSD license and is available on GitHub: looopTools/querycpp. The project is still in its infancy and not ready for production usage, but I wanted to share information about it so that people can follow along in the development process and give me input.

Additionally, the project serves as a learning experience for me, as I plan to include build-chains for CMake and a Conan package, both of which I have limited experience with. Besides CMake, build scripts will also be provided for waf.io and standard Make to accommodate most macOS, Linux, and BSD users (sorry, Windows, you will have to wait a bit).

ROAD MAP: The current road map for the project is to provide full support for PostgreSQL SQL syntax, supporting all our favourite commands. The first step will be to get basic SELECT statements off the ground and build up from there. Hopefully, we will see an alpha release of the project in late April or early May.

Plan for other databases: The benefit of most relational databases is that their languages are very similar, and queries (most of the time) work across SQL dialects. But some (quite a few) special cases exist, and we will have to address them to make QueryC++ work with other databases, and this is the plan. I just decided to start with PostgreSQL because it is the database I favour working with. The plan for adopting other databases is that when I am adequately satisfied with PostgreSQL compatibility, I will start implementing support for MariaDB, which should also cover a large part of MySQL.

What is this project not! This project is not meant as a replacement for jtv/libpqxx but rather to work with PQXX, where QueryC++ will make it easy to build your queries and PQXX will make it easy to connect to a PostgreSQL database and execute them.

./Lars

[C++] std::string contains a character or a substring

14 April 2021

I am a little late to this game (and will explain why later), but I often find that students need to identify if a string contains either a single character or a sub-string. In most languages this is pretty easy, as they provide a contains() function on strings which returns a boolean telling whether the string contains the character or sub-string. However, C++ does not provide such a function (yet), and using Boost to get this functionality is a bit excessive. But is it easy to do in C++? Well, based on the solutions I have seen from students, the answer is no! The main reason for this is that they do not know modern C++ or, more specifically, std::string.

Before I show how I would address this issue, I will show some solutions that students have shown me. One solution finds a single character and one finds a sub-string, with both being based on for-loops. For finding a single character, the students would iterate over the string to check, for each index, whether the element at that index matches the character, and return true if it does.

bool contains(const std::string& str, const char sub)
{
    for (size_t i = 0; i < str.size(); ++i)
    {
        if (str.at(i) == sub)
        {
            return true;
        }
    }
    return false; 
}

This solution is not bad or incorrect. But a question I often ask the students is: “If you do not need the index, why use it?” Next, I will suggest that if they insist on using loops, then they should use a for-each loop, as it reduces the risk of index errors. This morphs the code above into the code below. This solution is essentially no different from the students’ solution; it is just “safer” to use. But we will make it much better.

bool contains(const std::string& str, const char sub)
{
    for (const auto& elm : str)
    {
        if (elm == sub)
        {
            return true; 
        }
    }
    return false; 
}

Now, for identifying if a string contains a sub-string, it often looks similar to the code below. This code has the same risk of indexing errors if we are not careful, and it is also much more challenging to change it from index-based loops to for-each. Additionally, we have to remember to break the loop and return if we find what we are looking for. All in all, not nice compared to, for instance, Python, where we can simply write sub in str.

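bool contains (const std::string& str, const std::string& sub)
{
    // An empty sub-string is contained in every string.
    if (sub.empty())
    {
        return true;
    }
    // A sub-string longer than str can never match.
    if (sub.size() > str.size())
    {
        return false;
    }
    // Stop early enough that str.at(i + j) never goes past the end of str.
    for (size_t i = 0; i <= str.size() - sub.size(); ++i)
    {
        if (str.at(i) == sub.at(0))
        {
            bool found = true;
            for (size_t j = 0; j < sub.size(); ++j)
            {
                if (str.at(i + j) != sub.at(j))
                {
                    found = false;
                    break;
                }
            }
            if (found)
            {
                return found;
            }
        }
    }
    return false;
}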

But how do we make this code simpler and safer to use? Well, let us take a look at basic_string. What we will see is that string has a function called find, which returns a size_type that can be compared to std::string::npos. A cool thing here is that find works with both a char and a string as input, so instead of having different functions for contains, we can easily define a single function.

#include <string>

template<typename T>
bool contains(const std::string& str, const T sub)
{
    return str.find(sub) != std::string::npos;
}

The benefit of this solution is that it completely removes the need for indexing and for loops (at least loops exposed to us). Additionally, it is super easy to read. If you want to test the function, compile this code:

#include <string>

#include <iostream>


template<typename T>
bool contains(const std::string& str, const T sub)
{
    return str.find(sub) != std::string::npos;
}


int main(void)
{

    std::string base = "abba"; 
    std::string sub = "ba";
    bool _contains = contains(base, sub);
    std::cout << _contains << "\n";

    sub = "da"; 
    _contains = contains(base, sub);
    std::cout << _contains << "\n";

    char csub = 'a'; 
    _contains = contains(base, csub);
    std::cout << _contains << "\n";

    csub = 'f'; 
    _contains = contains(base, csub);
    std::cout << _contains << "\n";
}

Now, this is the best solution I know of in C++11/14/17/20. But remember that I wrote that C++ does not have a contains function yet? Well, with C++23 this will change, as a contains member function will be added to std::string with the new standard, and if you follow the link, you will see that what we have implemented above is the same as what is intended to be used in C++23.
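If you want the same helper to keep working before and after you switch to C++23, one option is the sketch below. It assumes your standard library defines the feature-test macro __cpp_lib_string_contains when the contains member function is available, and otherwise falls back to the find-based version from this post.

```cpp
#include <string>

// Sketch: use the C++23 member function when the standard library provides
// it, otherwise fall back to the find-based check.
template<typename T>
bool contains(const std::string& str, const T& sub)
{
#if defined(__cpp_lib_string_contains)
    return str.contains(sub);
#else
    return str.find(sub) != std::string::npos;
#endif
}
```

The call sites stay the same in both cases, for instance contains(base, sub) or contains(base, 'a').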

./Lars