clang vs gcc diagnostic flags

At some point I switched from gcc to clang (whenever that is possible, at least). This was around the time I still worked for Nokia. The aim was to improve the codebase, though I cannot report on how that went, for numerous reasons, one being that I no longer work there.

I still remember a few differences: clang liked to depend on include order, while gcc was more relaxed; and clang was keen on dropping objects it did not think were reachable, causing a linking nightmare (that was the last thing I tried to resolve).

Since then, the project I compile regularly is syslog-ng. It does in fact compile with clang, and did so even before I got to know the project. The part that bugs me is rather connected to the diagnostics (warnings).

The project has a nice set of warnings configured, but some are gcc-isms and keep my console busy every time I compile. I set myself a one-day project to clean this up, but first to understand those warnings and find possible alternatives in clang.

The list of warnings that bug me:

  • -Wcast-align
  • -Wmissing-parameter-type
  • -Wold-style-declaration
  • -Wsuggest-attribute=noreturn
  • -Wunused-but-set-parameter

My environment

on my computer

clang version 8.0.0 (tags/RELEASE_800/final)
gcc (GCC) 9.1.0

travis (which we use to compile)

clang version 5.0.0 (tags/RELEASE_500/final)
gcc (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4

Oh, and I am also limited to C99 on this project; for C11 I can at least think of other projects.


-Wold-style-declaration

Sample code:

const static int foo = 2;

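This warning tries to prevent swapping const and static: the storage-class specifier should come first, so static const int is the order to follow. (The C standard lists any other placement of a storage-class specifier as an obsolescent feature.)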

gcc supports it, as expected:

gcc -Wold-style-declaration -c -o /dev/null old-style.c
old-style.c:2:1: warning: ‘static’ is not at beginning of declaration [-Wold-style-declaration]
    2 | const static int foo = 2;
      | ^~~~~

clang also gives a warning, but not the one we would expect:

clang -Wold-style-declaration -c -o /dev/null old-style.c
warning: unknown warning option '-Wold-style-declaration'; did you mean '-Wout-of-line-declaration'? [-Wunknown-warning-option]
1 warning generated.

I could not find an alternative for this in clang; I even tried -Weverything (thanks for that, clang).
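With no clang equivalent at hand, the fix has to come from the code itself. A minimal sketch of the ordering gcc asks for, with the storage-class specifier first; the accessor get_foo is made up for illustration and is not part of the sample:

```c
/* Storage-class specifier (static) first, then the qualifier (const). */
static const int foo = 2;

/* Hypothetical accessor, only here so the constant is actually used. */
int get_foo(void)
{
    return foo;
}
```

Written this way, the declaration should give gcc's -Wold-style-declaration nothing to complain about.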

-Wmissing-parameter-type

According to the gcc documentation, it detects the following issue:

void foo(bar) { }

gcc -c -o /dev/null old-style.c -Wmissing-parameter-type
old-style.c: In function ‘foo’:
old-style.c:2:7: warning: type of ‘bar’ defaults to ‘int’ [-Wimplicit-int]
    2 | float foo(bar) {}
      |       ^~~

Let’s try again

gcc -c -o /dev/null old-style.c -Wmissing-parameter-type -Wno-implicit-int

No, silencing the implicit int does not help either.

Maybe the following would trigger:

float foo(int bar);

float foo(bar)
{
   return bar;
}

gcc -c -o /dev/null old-style.c -Wmissing-parameter-type -Wno-implicit-int
gcc -c -o /dev/null old-style.c -Wmissing-parameter-type
old-style.c: In function ‘foo’:
old-style.c:4:7: warning: type of ‘bar’ defaults to ‘int’ [-Wimplicit-int]
    4 | float foo(bar)
      |       ^~~

Nope!

Let’s see the source of gcc to find out what is happening here. Gcc is nice enough to have a test case for this option:

int foo(bar) { return bar; }

Which in fact just results in the same -Wimplicit-int warning; even the test asserts that. Something seems fishy here.

Finally, after digging up the code responsible for triggering this warning:

          if (flag_isoc99)
            pedwarn (DECL_SOURCE_LOCATION (decl),
                     OPT_Wimplicit_int, "type of %qD defaults to %<int%>",
                     decl);
          else
            warning_at (DECL_SOURCE_LOCATION (decl),
                        OPT_Wmissing_parameter_type,
                        "type of %qD defaults to %<int%>", decl);

The trick is to have an ancient skeleton (C89): the -Wmissing-parameter-type branch is only reached when flag_isoc99 is not set. No thanks.

What about clang? Well… it does not have this exact option. It does have -Wknr-promoted-parameter, but that has a slightly different meaning, and as such it does not produce a warning for any of the examples above.

In the end -pedantic saves the day: it works with both gcc and clang. Sadly I cannot use it either (there are a few things to be solved first).
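Whichever flag ends up reporting it, the fix on the code side is the same: give the parameter an explicit type. A small sketch of the last sample rewritten as a prototype-style definition, which should leave nothing for -Wimplicit-int or -Wmissing-parameter-type to report:

```c
/* Prototype declaration: the parameter type is spelled out. */
float foo(int bar);

/* Prototype-style definition: no K&R identifier list, no implicit int. */
float foo(int bar)
{
    return (float)bar;
}
```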

-Wunused-but-set-parameter

Sample code:

void foo(void)
{
   int bar;

   bar = 2;

}

gcc -c -o /dev/null old-style.c -Wextra -Wall -pedantic
old-style.c: In function ‘foo’:
old-style.c:4:8: warning: variable ‘bar’ set but not used [-Wunused-but-set-variable]
    4 |    int bar;
      |        ^~~

clang -c -o /dev/null old-style.c -Wextra -Wall -pedantic

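This is something that clang (8.0.0 here) does not yet support; good for gcc users, better for those who use both. Note that the sample above actually trips the sibling -Wunused-but-set-variable, since bar is a local variable rather than a parameter.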
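For completeness, a sketch of what the parameter flavour of the warning is about: a parameter that is assigned but never read. Both function names are made up for illustration; the second one shows one way to make the assignment matter.

```c
/* 'bar' is written but never read: this is the pattern
 * -Wunused-but-set-parameter is meant to catch. */
void set_only(int bar)
{
    bar = 2;
}

/* Here the assigned value is read back, so the parameter counts as used. */
int set_and_use(int bar)
{
    bar = 2;
    return bar;
}
```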

-Wsuggest-attribute=noreturn

Sample code:

void foo(void)
{
   while (1) ;
}

gcc -c -o /dev/null old-style.c -Wextra -Wall -pedantic -Wsuggest-attribute=noreturn
old-style.c: In function ‘foo’:
old-style.c:2:6: warning: function might be candidate for attribute ‘noreturn’ [-Wsuggest-attribute=noreturn]
    2 | void foo(void)
      |      ^~~

gcc -c -o /dev/null old-style.c -Wextra -Wall -pedantic -Wmissing-noreturn
old-style.c: In function ‘foo’:
old-style.c:2:6: warning: function might be candidate for attribute ‘noreturn’ [-Wsuggest-attribute=noreturn]
    2 | void foo(void)
      |      ^~~

clang -c -o /dev/null old-style.c -Wextra -Wall -pedantic -Wmissing-noreturn
old-style.c:3:1: warning: function 'foo' could be declared with attribute 'noreturn' [-Wmissing-noreturn]
{
^
1 warning generated.

It seems that -Wsuggest-attribute=noreturn and -Wmissing-noreturn are the same in the case of gcc, while clang only supports the latter. I would rather have an option that forbids non-returning functions altogether; but hey, this is a good-enough alternative.
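Following the suggestion is cheap. A sketch using the GNU-style attribute, which both gcc and clang understand (C11 has _Noreturn, but that is out of reach with C99); the name spin_forever is an assumption, the body is the sample from above:

```c
/* Declared noreturn so callers and the compiler know control never comes back.
 * __attribute__((noreturn)) is a GNU extension, not standard C99. */
void spin_forever(void) __attribute__((noreturn));

void spin_forever(void)
{
    while (1)
        ;
}
```

With the attribute in place, neither compiler should have a reason to suggest it any more.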

-Wcast-align

Sample:

int main(void)
{
    char foo[] = "foobar";
    int bar = *(int*)(foo + 1);
    return 0;
}

gcc -c -o /dev/null old-style.c -Wcast-align
clang -c -o /dev/null old-style.c -Wcast-align
old-style.c:6:16: warning: cast from 'char *' to 'int *' increases required alignment from 1 to 4 [-Wcast-align]
    int bar = *(int*)(foo + 1);
               ^~~~~~~~~~~~~~~
1 warning generated.

Note: I compiled this for x86-64, and it seems gcc takes the target into account, because if I grab an ARM toolchain:

arm-gcc -c -o /dev/null old-style.c -Wcast-align
old-style.c: In function 'main':
old-style.c:6:16: warning: cast increases required alignment of target type [-Wcast-align]
     int bar = *(int*)(foo + 1);
                ^

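Well, it kind of feels bad that gcc, despite being asked for this warning, plainly ignores it on x86-64. gcc 8 and later do offer -Wcast-align=strict, which warns regardless of the target; otherwise, hey, use clang!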
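Independent of which compiler reports it, the cast itself can be avoided. A minimal sketch of reading an int from an arbitrary byte offset with memcpy instead of a pointer cast; the helper name read_int_unaligned is made up for this example:

```c
#include <string.h>

/* Read an int from a possibly unaligned address without casting the
 * pointer; memcpy places no alignment requirement on its arguments. */
int read_int_unaligned(const char *p)
{
    int value;

    memcpy(&value, p, sizeof value);
    return value;
}
```

In the sample above, int bar = read_int_unaligned(foo + 1); would replace the cast, and -Wcast-align has nothing left to flag.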

Conclusion

If any can be made. First, of course, I am trying to upstream at least a few of these changes: https://github.com/balabit/syslog-ng/pull/2810. I had some trouble with clang and ended up not enforcing some of the warnings. It looks like clang in fact looks inside macros and reports diagnostics where they are expanded (macro expansion probably happens before, or in parallel with, diagnostic reporting), while gcc is a lazy dog and does not even bother with the macros.

If your project uses a lot of macros, you should consider at least compiling with clang; or better, avoid macros at all costs! But now I do not know whether I should list this as a pitfall of macro usage or of gcc…

Play Tic-Tac-Toe with syslog-ng

Background information

The fun experiment below was done entirely with syslog-ng, an application for handling (that is, collecting, sorting, and so on) logs. Normally those functions would be in focus, but not now. I would like to explore different aspects of the application, including some configurations that may seem nothing short of magical (at first sight, at least).

About a year ago I shared a configuration that calculates the value of PI. The post (Calculate PI with syslog-ng) explains in detail how that configuration works. While it was fun to create, it lacked one important thing: user interaction.

I was in need of a simple game, simple both when it comes to visuals (as only console support is available) and game logic (and to be honest, the mechanics in the background are not always that simple). As a result, I decided to re-create a classic: Tic-Tac-Toe.

There were some rules I set up for myself:

  • The original syslog-ng code was not to be modified to achieve any of this (namely, the code was to remain upstream and releasable)
  • No language bindings were to be allowed, as it would have been an easy task to write those things in a different programming language and just connect the proper bindings afterwards.

Tic-Tac-Toe

Without further ado or lengthy explanation, just grab the latest syslog-ng (at least version 3.22.1 is required) and the following configuration file: https://github.com/Kokan/syslog-ng-conf-lang/blob/tic-tac-toe-blog-post/tic-tac-toe.conf. Make sure you start syslog-ng in foreground mode (-F) without any debug/trace level enabled, passing the configuration file with -f:

> syslog-ng -F -f /tmp/tic-tac-toe.conf


  X |   |
 -----------

    |   |
 -----------

    |   |
 -----------
 Tic-Tac-Toe
You are with O, please provide a step (eg.: b3):
b2

  X | X |
 -----------

    | O |
 -----------

    |   |
 -----------
 Please give your next move:

and so on.

The only significant difference between this configuration and the one calculating the value of PI is that it can wait for, and react to, user input. The easy part is listening to user input via the console, as there is already a stdin source.

Reacting to user input

There are a few ways to approach this as well. Junction and channel are fairly old features that can create a conditional path. In the meantime, however, if-else (essentially syntactic sugar for the former) has also become available. As of now, there is no switch-like method to react to messages.

Nevertheless, there is also a cooler way to do it; since this patchset syslog-ng#2716 there is a way to map values into a different value, as follows:

template "state1" "state2";
template "state2" "state3";
template "state3" "final";


$(template state1) # => "state2"
$(template ${current_state}) # => it depends on value of ${current_state}

Using this method, it is possible to create a switch-like case for values. Naturally, this could always be written as an if-else, but it would take much more configuration, and thus complexity would quickly rise. Looking at the current Tic-Tac-Toe implementation, it may seem like a problem. In fact, it feels more natural to write a state machine this way.
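To make the idea concrete outside of syslog-ng, here is a rough C sketch of the same lookup-driven state machine. The table mirrors the three template lines above; the struct and function names are invented for illustration and have nothing to do with syslog-ng internals.

```c
#include <stddef.h>
#include <string.h>

/* One "template": a name (current state) mapped to a value (next state). */
struct transition {
    const char *from;
    const char *to;
};

static const struct transition transitions[] = {
    { "state1", "state2" },
    { "state2", "state3" },
    { "state3", "final"  },
};

/* Look up the next state; returns NULL when no mapping exists. */
const char *next_state(const char *current)
{
    size_t i;

    for (i = 0; i < sizeof transitions / sizeof transitions[0]; i++) {
        if (strcmp(transitions[i].from, current) == 0)
            return transitions[i].to;
    }
    return NULL;
}
```

Adding a state is one more table row rather than one more if-else branch, which is the property the template trick buys in the configuration.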

The following example describes a set of possible steps:

# Moves

template "X00000000+12" "XO0X00000";
        template "XO0X00000+13" "XOOX00X00"; #X win
        template "XO0X00000+22" "XO0XO0X00"; #X win
        template "XO0X00000+23" "XO0X0OX00"; #x win
        template "XO0X00000+32" "XO0X00XO0"; #x win
        template "XO0X00000+33" "XO0X00X0O"; #x win
        template "XO0X00000+31" "XO0XX0O00";
                template "XO0XX0O00+13" "XOOXXXO00"; #x win
                template "XO0XX0O00+23" "XO0XXOO0X"; #x win
                template "XO0XX0O00+32" "XO0XXXOO0"; #x win
                template "XO0XX0O00+33" "XO0XXXO0O"; #x win

Each key is the state of the game (nine cells of X, O, or 0 for an empty cell), followed by + and the step provided by the user as row and column; it is mapped to the next state, with the reply of X already applied. Imagine writing just this subset of states with if-else.
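The key composition can be sketched in C as well. This is only an illustration of the "state+move" lookup, not how syslog-ng evaluates templates; the three rules are copied from the excerpt above, while the struct and the apply_move function are hypothetical names.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

struct move_rule {
    const char *key;   /* "<board>+<row><column>" */
    const char *next;  /* board after the user's move and the reply of X */
};

/* A few rules copied from the configuration excerpt. */
static const struct move_rule rules[] = {
    { "X00000000+12", "XO0X00000" },
    { "XO0X00000+31", "XO0XX0O00" },
    { "XO0XX0O00+13", "XOOXXXO00" },
};

/* Build the lookup key from the board and the move, then search the table.
 * Returns NULL when the key does not fit or no rule matches. */
const char *apply_move(const char *board, const char *move)
{
    char key[32];
    size_t i;
    int n = snprintf(key, sizeof key, "%s+%s", board, move);

    if (n < 0 || (size_t)n >= sizeof key)
        return NULL;

    for (i = 0; i < sizeof rules / sizeof rules[0]; i++) {
        if (strcmp(rules[i].key, key) == 0)
            return rules[i].next;
    }
    return NULL;
}
```

A move without a rule simply has no mapping, which gives a natural place to reject invalid input.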

Waiting for user input

This part was the actual puzzle for me. Actually, what I needed was a way to somehow store a state, and make it accessible in the message provided by the user (don’t forget that we have messages to deal with). The idea was to have one message from the user, and another one to store the state. Initially I was thinking about creating two logpaths for each of those, but in the end I found that even when using two logpaths, merging the messages is needed.

Luckily, a parser called grouping-by already exists, and it does exactly what is needed here. For this parser, it is possible to provide a key that groups messages together, and then the contexts of any number of messages can be merged.

After this the logic is fairly easy:

  1. Generate an initial message with an initial state.
  2. Receive user input or receive input from the previous loop with a state (similar to the pi configuration).
  3. Wait for user input, and a state message.
  4. Act upon them: display, game logic.
  5. Forward the new state to the next loop and go back to step 2.

Now let us see the above within the configuration:

# MAIN

log {
        source { tic-tac-toe-initiate-game-state(); };
        source { tic-tac-toe-input(); };
        source { tic-tac-toe-state(); };

        parser {
          grouping-by(
             key("")
             aggregate(
               value(".pass-throu" "1")
             )

             trigger( match("1", value(".tictactoe.input")) )
             timeout(9999999) #hopefully never
          );
        };

        parser { tic-tac-toe-move(); };

        filter { match("1" value(".pass-throu")); };
        rewrite { set("0" value(".pass-throu")); };
        rewrite { set("0" value(".tictactoe.input")); };
        rewrite { set("0" value("MESSAGE")); };

        destination { tic-tac-toe-tui(template("${.tictactoe.state}")); };
        destination { unix-stream("/tmp/tic-tac-toe.sock" template("$(format-ewmm)")); };
};

What’s next?

There are a few ideas to act upon, but coming up with, and then solving, these kinds of issues requires a certain state of mind. All the same, solving more complex issues by writing code is tiresome, so maybe I’ll act upon that project next… :)

Budapest Startup Safari

Startup SAFARI is basically days of open doors for startup ecosystems: startups, companies, VCs and accelerators open their doors for attendees. Attendees register for sessions hosted by participating companies in their offices, travel around the city following their individual programs, interacting, learning and networking. Everyone is welcome to join.

(https://budapest.startupsafari.com/how-it-works/)

The conference ran for 3 days; the last day had only one event, which I did not attend.

By its nature it involved a lot more self-organising and travel. By the time I had my ticket, a few presentations were fully booked. That is both good and bad: good, because you do not travel a long way without getting in; bad, because I saw signs that not everybody who made a booking went to the presentation. Possibly there was some space to fill, or one could have chosen to sit on the floor. Anyway, it required a lot of planning between sessions.

The list below does not cover all of the presentations I attended, just the few I thought were at least worth mentioning.

Intellectual Property Protection

Recently this topic got my attention, as the area is mostly unknown to developers but should matter to them as well, for example when somebody chooses to use a component. The presentation was mostly about some facts and general advice based on the presenters' experience with IP. It covered (mostly Hungarian) rules about patents, copyright, and design protection.

One of the interesting things was the presenter's estimate that a patent valid worldwide would cost roughly 70-80 million Hungarian forints (~220000 euros), and that acquiring it takes anywhere from 0 to 20 years. The 20 years still seems unrealistic to me, as the patent gives you protection for the exact same amount of time.

There were a few more tips, like do not publish a paper on your invention before applying: prior publication, even if you were the one who published it, makes your patent claim void. That is kind of strange to me, but hey, that’s the rule to follow.

There were stories about companies (such as pharmaceutical companies) abusing the patent system, since submitting a request for a patent costs nowhere near the amount described above, or the amount a single patent would cost. Yet it can prevent other companies from using the technology described in the request, as no other company can get a patent for it. By the time it turns out that the original submitter does not want to commit to the patent, it no longer matters, as the technology has become obsolete.

I would say I did not get what I wanted originally, but this presentation with examples was good to hear.

HOW TO BUILD A CYBERSECURITY STARTUP FROM ZERO!

This presentation was held by Balázs Scheidler (one of the founders of the company that last hired me).

I could not find a recording of this, though it would have made this description much shorter. He told two stories that happened in parallel during the life-cycle of Balabit. One of them focused on the business side, including the points he thought were critical phases. One example was about bringing in external funding in order to get funded sooner.

He also spoke about syslog-ng: how it came to be, why it was open source from the start, and what steps led to its current state. This was interesting to me, as I work on the syslog-ng project as a full-time developer.

DRONE FLYING…

I went there mostly because of Go on embedded chips. There were nice demos with a drone flying, doing flips, and following a human face as long as it was in WiFi range… I wanted to see the code doing all of this, but that was not the focus of the presentation. Hey, there is a cool project called gobot.io: “Gobot is a framework for robots, drones, and the Internet of Things”. It supports things like Arduino and OpenCV, and seems a reasonable thing to check out for automation.

A HISTORICAL OUTLOOK ON X86 PLATFORM SECURITY

For me this was the best presentation. It was mostly about the Intel Management Engine, which kind of has access to everything (RAM, disk, CPU), even if the machine is turned off. The good thing: there is no official way to turn it off :)

The Intel ME is actually a computer on the motherboard, which runs MINIX! The strange thing about the ME is that Intel does not give an official way to turn it off, despite a lot of requests. Thanks to the effort people put into hacking and reverse engineering it, it turned out that there is an option to disable it, which Intel confirmed: “As Intel has confirmed the ME contains a switch to enable government authorities such as the NSA to make the ME go into High-Assurance Platform (HAP) mode after boot.”

The presentation ran for 30 minutes and gradually became more like a conversation; things like the Talos II and RISC-V based open designs came up.

Conclusion

Generally I disliked the travel involved, because I missed a few presentations or arrived late; well, the majority of people were late, as everything got delayed. I also thought more development would be involved, but a few presentations were plain marketing. What would have made the experience better is some way to plan a route between sessions. For example, it could ask for my main mode of transportation (public transport, car, etc.) and query some API like Google Maps for the estimated travel time between sessions. That would have helped a lot, and possibly I could have attended more presentations.

FOSDEM#19

FOSDEM

This was my second year visiting this conference, and I still think many more years are going to follow. If you are into Free and Open Source Software, this is a conference you should visit at least once; the rest is going to follow.

Day 0

I had some time, so I have decided to try out volunteering as well. At least that was the initial plan. I had a few setbacks with this, as initially my flight was scheduled later than starting time of Friday’s build up. Also the flight got delayed, so I could only arrive around 16:00.

They had just finished all of the tasks. Despite my unsuccessful contribution, I still got a volunteer t-shirt. Thanks for that.

Day 1

My plan was to stick to one devroom each day as much as I could. In the morning you can get a seat in the room; it is much harder later. There are a lot of people. There is also no break for eating, so you should either come up with a schedule that skips some presentations, or just bring food before it starts. That is what I did.

The track I decided to follow was about RISC-V. I have been following this new ISA for a few months now, so it seemed natural.

LLVM+Clang for RISC-V

The presenter gave an overview of upstreaming RISC-V support into LLVM. There was a brief summary of the kinds of tasks involved in supporting a new architecture. The current state works, but there are missing improvements like position-independent code support and the outliner. I was surprised to hear that something like an outliner exists. Sure, it makes sense if you optimize for code size, but I have never had to deal with software where that mattered. A little more detail about outlining: J. Paquette, “Reducing Code Size Using Outlining”.

Materials: https://fosdem.org/2019/schedule/event/riscvllvmclang/

Debian, Fedora

There are people from major distributions who are interested in RISC-V and do make the extra effort. Both Debian and Fedora have regular builds. It seems to me that Fedora is in a much better place (even gdb works). When I first tried the Fedora port it was not that usable; now I am thinking of doing the experiment again. I would love to see syslog-ng compiling and running, and I am a little curious about testing, as Criterion could be hard to bootstrap at first :)

Debian: https://fosdem.org/2019/schedule/event/riscvdebian/
Fedora: https://fosdem.org/2019/schedule/event/riscvfedora/

Buildroot

Recently the open source project I am working on got a report from a Buildroot maintainer. I did not know this project before, but I pretty much liked it at first glance. It makes it easy to do different kinds of builds combining musl, glibc or uClibc with a lot of packages. It even has a CI testing different kinds of configurations, which proves that our software can compile for such setups.

As of now it also supports RISC-V, which makes my job so much easier. I could simply configure it the same way as the Linux kernel (menuconfig). There is still work to do, as I have segfaults to investigate in a few pieces of software.

Materials: https://fosdem.org/2019/schedule/event/riscvbuildroot/

OpenSBI

Materials: https://fosdem.org/2019/schedule/event/riscvsbi/

This is a straightforward thing: somehow most people do not like the Berkeley Boot Loader (BBL), as it tends to be not so flexible. (I do not know better.) This initiative provides an alternative solution, possibly with plugin capability.

The end of the presentation was a disaster, as it turned into a license flame war between BSD and GPL; that was my first live flame war. They went on for a few minutes, not letting the presenter finish his talk. It could not be heard that well, because the mic was only for the presenter.

Google Summer of Code meetup

I had to cut the day short, as there was a meetup organized by Google. They provided the place, beer, fries and burgers. I could not count the number of people, but the time was finite. I talked mostly with three people, from PostgreSQL, OpenStreetMap and GNU Radio. I was happy because at least one of them knew what syslog-ng is; they actually use it to monitor their build infrastructure. It was a very basic use case.

The biggest topic was OpenStreetMap's use of PostgreSQL and the way they use it; it was a very intense but rewarding discussion.

Thanks to Google for organizing this event.

Day 2

This day started well, with some energy drink spilled on me on the tram. I had already left my room and had almost arrived at the conference; having an extra t-shirt in my backpack, I decided to just change quickly, so nothing could come between me and the LLVM block.

Roll your own compiler

Materials: https://fosdem.org/2019/schedule/event/llvm_irgen/

If you ever want to create your own language and you need a compiler fast, you could speed things up with LLVM. This talk was about bringing back Modula-2. After some time spent on Bison/flex work, this can be rewarding. Maybe I’ll consider rewriting one of my compilers with LLVM.

BCC

Materials: https://fosdem.org/2019/schedule/event/llvm_bpf_rewriting/

I can only quote from the FOSDEM site: “The bcc project [1], mostly known for its collection of Linux tracing tools, is a framework to ease the development of BPF programs for Linux.” I could not understand most of the presentation, but it was good enough to make me look into the topic in detail.

Debug info improvement

Materials: https://fosdem.org/2019/schedule/event/llvm_debug/

This told the story of how they ported a gcc feature into lldb, which can help debugging code compiled with -O2 or higher by walking the stack.

LLVM for the Apollo Guidance Computer

Materials: https://fosdem.org/2019/schedule/event/llvm_apollo/

This is more of a fun hobby project than a useful one: it tries to compile C programs into the Apollo Guidance Computer’s assembly language. The project is not complete.

syslog-ng with python

Materials: https://fosdem.org/2019/schedule/event/python_extending_syslog_ng/

The last talk for me was about extending syslog-ng with Python. Python is an easy way to make quick demos or PoC work, but it can be sufficient to solve very specific problems as well.

Conclusion

FOSDEM 2019 was as fun as last year. It has its difficulties, but the diversity of people and presentations makes it worth participating. I hope that next year I can participate as a presenter as well.

include random

Formal education

Mathematics

At the university I had a few classes that touched on the topic of randomness, and I do not mean some pseudo-random thing, but rather the purely abstract mathematical point of view. We had our fun at that time.

IT

Of course in computer science classes we had random number generators, or did we?

I had an exam where we had to write a multi-threaded application in Ada. It was some kind of simple producer-consumer problem, and the only part I was missing was generating a random number. I had totally forgotten how to do it in Ada, and we could not use any manual or help during the exam. At that point I could either fail the whole exam or cheat. (That is something I disliked at university.)

In the end I came up with a rather error-prone random number generator:

  return 4;

At least it compiled, and I could finish the important part of the exam.

<random>

This is a nice extension to standard C++, with which I can write code and still think I am a math person. You can watch a nice presentation about the library.
