I bet I’m not the only one who feels like this.
Therefore, with conference season upon us, I thought I’d share some advice on how to network. In particular, I want to share the two golden rules that I follow at any networking event, and at conferences in particular: the PacMan Rule and the $n$-a-day Rule.
So you’ve made it to the conference, and to the opening reception. You are holding a plate with some fancy bites that you just picked up from the buffet in one hand, and a drink in the other hand. You look around the room and see all these little groups of people, talking together, with their backs to the outside world.
How do you join a group? The only option, really, is to open your mouth and ask. That means having to interrupt a conversation, and you don’t want to do that. Hence, you hang around awkwardly in the vicinity of one or more of these groups, hoping to catch a lull in the conversation that you can use to squeeze in a quick “Hi, may I join you?” You have to be quite loud about it, also, since the room is rather noisy with all the chatter.
Not ideal.
Sure, people will likely let you join, but it would be great to avoid the awkwardness. Also, it can be quite off-putting to have to ask. If you have to ask to join a group, maybe that group is not the most welcoming anyway?
Here’s what I try to do to lower the threshold for other people to join the conversation: I apply the PacMan Rule.
I did not come up with it. I believe it was Eric Holscher who introduced it. The rule is simple:
When standing as a group of people, always leave room for 1 person to join your group.
To visualise it: stand in a PacMan formation. The people in the group form the PacMan circle, and the open spot is the open mouth.
As soon as someone joins the group, shuffle over to make room for one more. When the group gets too big, you can always split off with a few people and start a new, smaller group, in PacMan formation.
This makes it very easy for someone to join without having to interrupt the ongoing conversation. It is a non-verbal way of creating a welcoming and inclusive environment, and reduces the feeling of there being closed cliques that outsiders do not have access to.
Try it!
I do not know who came up with this, so if you do know, please tell me who to credit!
This rule is also about welcoming people. When you have been integrated in your community, you have likely made researcher friends. At this point, you find yourself looking forward to conferences as an opportunity to catch up with them.
However, fresh graduate students, or even established researchers joining a new field, do not have such a friend group yet. They need to network to find one, but it can be quite daunting to insert yourself into a new community if there are already these established cliques of friends.
The $n$-a-day Rule is designed to encourage both new people and established researchers to network. The rule is as follows:
If this is your $n$th time attending conference X, make sure to introduce yourself to at least $n$ new people on each day of the conference.
As a young researcher, you’ll probably have an easy time doing this. If you’re attending IJCAI for the first time, it should not be too hard to find at least one person you haven’t met yet on each day.
As an established researcher, you will have to make a conscious effort to welcome the newbies. Again, at IJCAI this probably won’t be too hard. During smaller, more specialised conferences, though, you may have to introduce yourself to every single new person who is there in order to meet your quota.
I think that’s good. After all, at the specialised conferences, we are really together with our community, with the people who we will work with for maybe the rest of our careers. Hence, it’s extra important to make an effort to foster an inclusive environment that welcomes new members to the community.
If you are attending any networking events this summer, please try applying these rules. Let me know how it goes?
Would you change anything about them? Would you add any new rules? Do you apply networking rules that are different from mine?
Let me know! I’d love to get better at networking, and could use all the help I can get!
On our first day in Tasmania, Hélène, Guillaume, John and I had gone on a hike to see a forest and waterfalls. We spent the second day going to Port Arthur, the famous former penal colony that was the site of the 1996 Port Arthur Massacre, which triggered Australia’s ban on guns.
On our third and last day, we would drive from Hobart to Devonport, so we could catch the overnight ferry back to Melbourne from there. The plan was to stop by Mole Creek Karst National Park on our way north, so we could visit the Marakoopa caves, which are famous for their glowworms. Even though it was winter in Australia (all of this happened on 4 September 2017), which means fewer glowworms than in summer, we were still excited by the prospect of seeing an underground starry sky.
Or, at least, I was. Us girls had pretty much done all the planning and reservation-making and ticket-buying, under the implicit agreement that the boys would not complain about the choices we made and the attractions we visited. The Marakoopa caves had been my suggestion, and I just hoped that they would be exciting enough for the others to be happy with that decision.
Being in charge of logistics, Hélène and I had minutely planned every detail of the trip, including the routes that we would take to get to the different attractions. I had loaded the .gpx files of the routes into my Garmin eTrex® 10, just in case all phones would fail. As we were feeding our Airbnb’s alpacas (one was called Frankie, and I forgot the other one’s name, which is incredibly irritating) one last time, Hélène looked at the map and noticed that there was an alternative route north, which would take us along some lakes, and only be a little bit slower. She proposed to take that one instead of the one we had planned. Since seeing some lakes sounded like a good idea, and since she was the driver and therefore had veto rights on the route, we agreed, and we went on our way.
It turned out that the road that looked like one of the main roads of Tasmania was actually a gravel road through the mountains. Not a huge issue, except that we were also in the middle of a snow storm. It was -4 °C outside, and the gravel road wound its way along the slopes, which were thickly forested with icicle-adorned eucalyptus trees. We could only drive 30 km/h. Any faster, and we were afraid of slipping in a bend in the road or colliding with another car, due to the bad visibility in the snow storm. Any slower, and we were afraid that we would get stuck in the snow and be unable to restart the car. We had been warned that one of the biggest causes of deadly road accidents in Australia was collisions with kangaroos, so that worried me, too (in fact, that evening, we had to carry a dead wallaby off the road because it was blocking our way).
Two days earlier, when we picked up our rental car, the rental car company had tried to sell us an upgrade to a bigger car. We had refused, but found that they had given us the bigger car anyway (likely it had been the only one they had, and they had tried to make some money off of us by making us pay extra for it). Now, we were very grateful to have a heavier vehicle, which was less likely to be blown off the mountain road by a gust of wind.
As I looked at the time, my heart sank. There was no way that we were going to make the 1 pm tour through the Marakoopa caves that I had selected for us. I didn’t say anything, though. No point in stressing out our driver.
Somehow, though, we pulled into the parking lot of the Marakoopa caves visitors centre at 12:58 pm. I ran to the ticket office, while Hélène, Guillaume and John parked the car. I bought the tickets, and we ran to the cave entrance, to find nobody there waiting for us. Did we miss it? Had the guide just gone in without us?
As we looked around and tried to figure out what to do, a young woman arrived, carrying a guitar case and a violin case. Hearing us speak English in three different accents, she introduced herself as “Emily”, and asked who we were and what had brought us to Tasmania. We explained, and as I was waiting for an opportunity to ask her why on earth she was carrying a guitar and a violin, our tour guide arrived, and took Hélène, Guillaume, John, Emily and me into the cave.
It was beautiful. The limestone was reddish and yellowish, and glittered with all the crystals embedded in the stone. We heard the drip drip dripping of stalactites and stalagmites forming around us. We followed an underground river, sometimes diverting away from it, then catching up with smaller streams. Often, we had to squeeze through low or narrow passages, and all the time I wondered why Emily would put herself through all of this with a guitar and a violin.
Eventually, we reached quite a large cavern, through which a wider river flowed. The guide turned off the lights and, in the dark, we saw the small specks of lights that were the brave little glowworms trying to attract a mate. I imagined myself in a friendlier version of the Koom Valley caves that Commander Vimes finds himself in, in Terry Pratchett’s Thud!.
We moved on to the highlight of the tour: a large cavern, called The Great Cathedral. There, the tour guide dimmed the lights, and Emily took out her violin. She explained that she was a composer, and that she was in the process of composing some music. Some time ago, she had been walking on the beach, in the aftermath of a forest fire, when she saw something move. Somehow, a bird’s nest had survived the fire, and the baby chicks were hatching, shaking the ashes off their tiny little heads.
I don’t remember if she actually said this, but in my mind, Emily told us that a parent bird arrived to attend to the babies. I hope that that’s what happened.
Emily explained that she had composed a piece inspired by that experience, and was now adapting it for performance in The Great Cathedral, specifically. Every day, she would enter the cave with the first tour, and stay there until the last tour of the day would take her back above ground. In between those tours, she would be alone in the Cathedral, working on her music.
Then, the tour guide dimmed the lights, and Emily played the piece for us.
It was magical. Her music was quiet in a way that I always find hard to explain to people. How can music be “quiet”? But it was, and it was beautiful. It reminded me of Arvo Pärt’s work. Later, when I asked her who her favourite composers were, she mentioned him first.
Standing in The Great Cathedral, suddenly getting a private concert, listening to the kind of music that I loved, I tried to experience that moment to the fullest. I was such a lucky person. I had had the opportunity to travel all the way here, and to experience all of this. Happy, happy, happy. As Guillaume later described it, this experience was “awesome, unexpected, and free”.
After Emily’s concert, we left her in the Cathedral, and made our way back to the surface. When I travel, I always like to carry postcards of my hometown. That way, if I meet someone special, I can leave them a little memento of our meeting, along with my contact details. Since Louvain-la-Neuve, however much I love it, is perhaps not the most photogenic place in Belgium, I had brought postcards of Brussels instead.
I had one left, of the Brabantine Gothic Town Hall of Brussels, if I remember correctly, and we decided to write Emily a postcard. We thanked her for her concert and left our contact details, asking her to let us know if/when she would ever release her music. We asked the nice lady at the ticket office to give our postcard to Emily.
We never heard back from her.
Until now. Sort of. I had often wondered what had happened to Emily’s project. Surely, if I were to contact the Marakoopa caves visitors centre, someone would remember an Emily who went down to the caves every day in September 2017 with a guitar and a violin to compose a piece of music? I never contacted them, though.
A few days ago, however, I decided to Google her. Emily turned out to be Emily Sheppard, and she ended up creating a full album from her time in the caves: MoonMilk. Based on her description, I wonder if the piece she played for us was an early version of “Aftermath”.
I have purchased her album, but I haven’t listened to it yet. Part of me is afraid that the music will disappoint me, or overwrite my memory of it, precisely because it is such a precious memory of mine. Part of me is also trying to savour the experience, building up my own anticipation for just a little longer. Some music just hits me straight in my soul.
Maybe I will report back, to share what happened when I finally listen to the echoes of one of my most precious travel experiences ever.
Moon milk is a white, creamy substance that is found in some limestone caves. See its Wikipedia page for some hypotheses on its origin. Moon milk is also an Ayurvedic drink, which apparently is super “hot” right now, and supposedly helps you sleep.
When I asked our tour guide about the meaning of “marakoopa”, she explained that it likely means “beautiful” in the language that was spoken by the Indigenous people. The European colonisers eliminated not only most of the people, but also their language, so they do not even know that for certain. They do have indications that the caves were of cultural importance to the Indigenous people, but they don’t know much more than that. Even though I had just spent three weeks in Australia, repeatedly hearing these kinds of stories, it still broke my heart. Two days ago, we had touched trees that were old enough to have been around before colonisation. Maybe, when those trees were much smaller, Indigenous people had touched those trees also. Now, nothing but a few words and whispers of their existence were left. It is such a waste. It is such overwhelming injustice.
While writing this blog post, I made a donation to the Aboriginal Land Council of Tasmania, who are running a crowdfunding campaign to buy back land, so it can be returned to lutruwita/Tasmania’s Aboriginal community.
What stood out most to me this year, is how productive and how much fun the poster sessions were! This was my fourth time presenting a poster, and I honestly never felt like I was particularly good at presenting posters. In designing the poster for our paper, I remembered the ‘Better Research Poster’ video, and tried to apply the principle to my poster, while sticking to my favourite colour scheme.
This week, Anna Latour (@aldlatour) presented our work on showing the power of encoding a problem X to another problem that's computationally harder than Y. Yes, you read it right.
— Kuldeep S. Meel (@ksmeel) August 26, 2023
We focused on the classical problem of Identifying code set from sensor placement literature..1/3 pic.twitter.com/MojX6baPBf
After my 8-minute presentation in the CSO: Constraint Programming session (which went well enough, I guess, but there’s not that much you can say in such a short amount of time), I went to the expo hall, where I had put up my poster earlier that day, and waited until someone would come over for a chat.
The first person to do so came solely to point out a typo in a table on my poster. Okay. Fair. I guess that means that he at least understood what the table should have shown.
The second person to come up to me came to apologise for the first one, and to compliment me on my choice of colour. Okay. Thanks. Could’ve been worse, I guess.
So far, my experience of this poster session wasn’t great, but it was entirely in line with how I had experienced poster sessions before. However, things picked up after that! The poster session was only 90 minutes long, but I ended up spending a good two-and-a-half hours talking to people about our paper, and about their own research.
At some point, there was nobody at my poster, so I took a quick break to find something to drink. When I returned, I found Kuldeep presenting the poster to someone. Firstly: aaaaw, thank you! Secondly: it was so much fun and so educational to hear him explain the project that I had worked on so hard. He has an enormous talent for identifying the key information that the listener needs to know, and to sell the work in a way that the audience immediately grasps its importance!
It seemed that the spicy teaser I had put on the poster as a take-home message did its job. People remembered, and even the next day I had three or four more hours of discussing our work with others. Moreover: people challenged me on it (fair!), which not only allowed me to explain and discuss details, but also taught me a lot about how to think and talk about this research. Loved all the critical questions, even if the side effect was that I wanted to take a one-week vacation to completely rewrite the entire paper.
I found it incredibly tiring to listen to 8-minute talk after 8-minute talk. I just don’t pick up that much information from these short pitches, and having to keep track of the time and the program so that I can switch from room to room just adds to the cognitive overload that I get at conferences.
Hence, I was very happy to have found the quality of the poster sessions quite high this year, and spent a lot of time in the poster room to check out other people’s posters, and to talk to the researchers who presented them. Here are five of my favourites from this year.
My first two favourites are the posters by Ten Cate et al. and by Bessiere et al.
I was very interested to see Ten Cate et al.’s demonstration of an application of SAT solving that I wasn’t familiar with. I wanted to learn more about PAC guarantees, so the fact that they had designed a SAT-based algorithm for learning description logic concepts with PAC guarantees really piqued my interest. I must say that I didn’t have enough background knowledge on either learning or description logic to fully grasp the contents, but I appreciated their efforts to make the poster stand-alone enough that you can get an idea of the contents of the paper by just reading it.
I thought the idea behind Bessiere et al.’s work was super interesting. I had never considered learning the constraint language along with the constraints as a goal, but it totally makes sense from the perspective of an end-user who does not necessarily have a CP background. I also very much liked the design of the poster, with clear images and helpful colours and design elements.
The next set of favourites are by Marinescu et al. and Bofill et al.
Obviously, I get excited about pretty much anything that relates to probabilistic inference in networks, so I had to check out this poster. However, even more than learning about the research itself, I enjoyed the chat I had with Radu when he presented his poster. He is incredibly knowledgeable, and I admire him for finding a way to work in industry but still do interesting and relevant academic research, and presenting it at conferences. As a postdoc looking for the next step in her career, I found my chat with him quite inspirational.
I spent quite some time thinking about SAT encodings of different kinds of problems. I did this in the context of understanding (weighted) model counters, in the context of generating test instances, and in the context of debugging solvers. Hence, I was quite intrigued when I saw the poster presenting Bofill et al.’s work. I must admit that I did not care much for the giant table of results that took up half of the poster, but I had a very nice chat with Jordi (who actually remembered meeting me in 2018), and enjoyed his clear explanation of their results.
And last, but definitely not least, I liked the poster for Toward Job Recommendation for All, by Bied et al.
This poster for sure got people’s attention during the poster sessions, and I think that that’s well-deserved. I find the contribution interesting and relevant; reducing bias in the job market is obviously a task worthy of study. Additionally, the poster has an attractive design that is dominated by figures, a fun movie reference, and a clear take-home message.
However, what I probably like most, is a little easter egg that is hidden in the section headers. An earlier version of my own poster had a reference to the fabulous Daft Punk song Harder, better, faster, stronger. It didn’t quite work for me, though, and I asked the community of academics on Mastodon for help. One of them found the right words to express my discomfort with it, and I ended up removing the reference, even though I felt that it might have been an element of my poster that would attract an audience.
Hence, I was delighted to spot a subtle reference to the song in this poster. One of the authors of the poster/paper actually recognised me from our interaction on Mastodon! What’s even better: he first mentioned it during the opening reception, while I was in the uncomfortable situation of having found myself trying to network with some men who were actually trying to test out their PUA skills on me.^{1} How sweet it was that another person joined that conversation because he recognised me from Mastodon. * chef’s kiss *
They actually told me to my face that the only reason they came over to talk with me was to “learn how to talk to girls”. So there I was. A grown-ass woman. A researcher with an accepted paper at a top conference. An academic looking for the next step in her career. Reduced to a “girl” for boys to practice on. ↩
Note: this blog post was written to give intuitions and provide more figures and tables to help understand the material. For a more formal description, please click here to go to a page with links to the paper, the extended version of the paper, our code, slides, a poster, a short video, and our .bib file.
As every constraint programmer knows, the way that you choose to model your problem is at least as important as the way you choose to solve it. However, no matter how good your favourite encoding is, you will inevitably stumble upon a problem whose encoding is so large that it does not fit in your RAM.
If you do not want to give up on your favourite encoding, you will have to give up on the solver that you are using, instead. Maybe you can come up with a streaming algorithm for your problem? Maybe you can let go of some theoretical guarantees?
We wondered what we could achieve by changing the encoding. What if we could come up with an exponentially more succinct encoding than that of the current state of the art? What would we have to give up on, then? Maybe we can gain an exponential reduction in size in exchange for higher theoretical complexity? If so, can we design a solver for that harder problem that is fast in practice? If so, can we use that solver to solve exponentially larger problem instances than the current state of the art?
In this work, we present a case study of an NP-hard problem, the generalised identifying code set (GICS) problem^{1}, which we reduce to a computationally harder problem: that of finding a grouped independent support (GIS) of a Boolean formula (more on that later). We then design and implement a new solver, gismo, for finding the grouped independent support of a Boolean formula, and demonstrate how effective gismo is in solving the GICS problem.
Let me explain the generalised identifying code set (GICS)^{1} problem with an example.
I am known among my coworkers as someone who gets a bit agitated about fire safety^{2}.
Right now, I live in Singapore, which is famous for many things, including the impressive Marina Bay Sands hotel (MBS):
The famous Marina Bay Sands hotel, with the ArtScience museum on the left, and the symbol of Singapore, the Merlion, in the foreground on the right. Photo by: Anna Latour
With its 2600 rooms, organised into three 57-storey towers, MBS is a little too big for us to use as a toy example, so let’s just use a cosy, imaginary hotel with only 5 rooms, instead:
We imagine that we can place smoke detectors in the rooms. For example, we could place a smoke detector in the blue room, and one in the orange room:
These smoke detectors have the property that they can detect smoke from a fire in the same room as they were placed in immediately, but can detect smoke from a fire in an adjacent room with a small time delay. We assume that fires do not have time to spread, and that smoke detectors cannot sense smoke further than one room away.
We imagine that the hotel’s concierge has a dashboard with an indicator light for each smoke detector. Hence, if a fire breaks out in the blue room at time $t_0$, the detector in the blue room immediately detects the smoke, and the indicator light for that detector turns blue at time $t_0$. We assume that a light stays on until we turn it off, so at time $t_1 = t_0 + 1$, the blue light is still on. Here is an example of the pattern that we see when there is a fire in the green room:
The pattern that we see above when there is a fire in the green room is that at $t_0$, when the fire breaks out, none of the detectors detect any smoke. Hence, all lights are off. At time $t_1$, however, the smoke has travelled from the green room to the blue and orange rooms, so the detectors in those rooms detect the smoke, and their corresponding lights on the indicator dashboard turn on. Hence, at time $t_1$, both the blue and the orange light are on.
We call a pattern like this a signature^{3}. Ideally, each room should have a unique signature, so we know where to send the firefighters when a fire breaks out. Looking at the figure above, you have probably realised that in this case, we cannot distinguish a fire in the green room from a fire in the red room, because they would both have the same signature.
The GICS problem asks to find out in which rooms we should place a smoke detector, such that all fires are detected, and uniquely identified. The goal is to minimise the number of smoke detectors that we place.
Let us make this a bit more abstract. In particular: let us model the hotel as a graph $\Gamma := (V, E)$. Each node in $V$ represents a room, and there is an edge between two nodes if their corresponding rooms are adjacent:
The GICS problem asks to find a dominating set with specific properties. We will call this set $D$, and it will correspond to the set of rooms in which we place a smoke detector. $D$ is called a generalised identifying code set of a graph $\Gamma := (V,E)$ if each node has a unique signature. The goal is to minimise the cardinality of $D$.
We will model the signatures of the nodes as tuples of sets of nodes: $\sigma_{v} := \langle S_v^0, S_v^1 \rangle$. Here, the first element of the signature, $S_v^0 := \{v\} \cap D$, models the indicator lights that are on at $t_0$ if a fire breaks out in room $v$ at time $t_0$. The second element of the signature, $S_v^1 := N_1^+(v) \cap D$, models the indicator lights that are on at $t_1$. Here, $N_1^+(v) := N_1(v) \cup \{v\}$ is the closed 1-neighbourhood of node $v$.
Now, let’s check if our candidate solution, $D := \{B, O\}$ is indeed a solution to the GICS problem on this graph. We can simply write down all the signatures:
Unsurprisingly, we conclude that this $D$ is no solution to our problem. We can add a smoke detector to the purple room, however, and see what happens:
Now, we see that each fire can be uniquely identified. However, the problem was also to minimise the cardinality of $D$. In this case, we can do that by removing the smoke detector from the orange room:
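The signature check above is easy to sketch in a few lines of Python. Note that the adjacency below is an assumption of mine: the real floor plan is only given in the figures, so this one is merely chosen to be consistent with the behaviour described in the text (e.g., the green room borders the blue and orange rooms).

```python
# Hypothetical adjacency for the 5-room toy hotel
# (B = blue, O = orange, G = green, R = red, P = purple).
ADJ = {
    'B': {'G', 'R'},
    'O': {'G', 'R', 'P'},
    'G': {'B', 'O'},
    'R': {'B', 'O', 'P'},
    'P': {'O', 'R'},
}

def closed_neighbourhood(v):
    """N_1^+(v) := N_1(v) ∪ {v}."""
    return ADJ[v] | {v}

def signature(v, D):
    """sigma_v := < {v} ∩ D, N_1^+(v) ∩ D >."""
    return (frozenset({v} & D), frozenset(closed_neighbourhood(v) & D))

def is_gics(D):
    """D is a GICS (for k = 1) iff every room is dominated (every fire is
    detected) and all signatures are pairwise distinct (identified)."""
    sigs = {v: signature(v, D) for v in ADJ}
    dominated = all(s1 for (_, s1) in sigs.values())
    unique = len(set(sigs.values())) == len(sigs)
    return dominated and unique

print(is_gics({'B', 'O'}))  # False: two rooms share a signature
print(is_gics({'B', 'P'}))  # True on this assumed floor plan
```

On this assumed graph, detectors in the blue and purple rooms identify every single fire, while blue and orange detectors do not, matching the story above.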
In the context of detecting fires in a hotel, it is not unreasonable to assume that there is only ever one fire breaking out at a time. However, other applications of identifying codes include detecting criminals in social networks (Basu & Sen, 2021a), and detecting spreaders of misinformation in online networks (Basu & Sen, 2021b). In those contexts, it is reasonable to assume that more than one ‘fault’ occurs in the network at the same time.
We denote the maximum number of simultaneous faults with $k$, and it reflects the maximum number of nodes that can reasonably be expected to ‘fail’ at the same time. In our example: it is the maximum number of rooms that can catch fire at the same time.
In the example above, we had implicitly assumed that $k = 1$. However, we could generalise the concept of closed 1-neighbourhoods and signatures to be defined for sets of nodes rather than individual nodes. In particular, we can define $N_1^+(U) := \bigcup_{v \in U} N_1^+(v)$ as the closed 1-neighbourhood of a set of nodes $U \subseteq V$, and update the definition of the signature accordingly: $\sigma_U := \langle S_U^0, S_U^1\rangle$, with $S_U^0 := U \cap D$ and $S_U^1 := N_1^+(U) \cap D$.
We update the table of signatures accordingly:
Hence, the problem that we study in this work is as follows:
Given a network $\Gamma := (V,E)$ and a maximum number of simultaneous failures $k$, find a dominating set $D \subseteq V$ such that for each pair of distinct sets $U, W \subseteq V$ with $|U|, |W| \leq k$, it holds that $\sigma_U \neq \sigma_W$, and $|D|$ is minimised.
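For tiny instances, this problem statement can be solved by exhaustive search, which may help to make the definitions concrete. The sketch below reuses the same hypothetical 5-room adjacency as before (an assumption on my part; the real floor plan is only given in the figures).

```python
from itertools import combinations

# Hypothetical adjacency for the 5-room toy hotel.
ADJ = {
    'B': {'G', 'R'},
    'O': {'G', 'R', 'P'},
    'G': {'B', 'O'},
    'R': {'B', 'O', 'P'},
    'P': {'O', 'R'},
}

def all_fault_sets(nodes, k):
    """All non-empty U ⊆ V with |U| ≤ k: the possible simultaneous fires."""
    return [frozenset(c) for r in range(1, k + 1) for c in combinations(nodes, r)]

def set_signature(U, D, adj):
    """sigma_U := < U ∩ D, N_1^+(U) ∩ D >, with N_1^+(U) = ∪_{v∈U} N_1^+(v)."""
    n_plus = set().union(*({v} | adj[v] for v in U))
    return (frozenset(U & D), frozenset(n_plus & D))

def minimum_gics(adj, k):
    """Smallest D whose signatures detect and identify every fault set.
    Exhaustive search, so only feasible for toy instances."""
    nodes = sorted(adj)
    fault_sets = all_fault_sets(nodes, k)
    for size in range(1, len(nodes) + 1):
        for cand in combinations(nodes, size):
            D = frozenset(cand)
            sigs = [set_signature(U, D, adj) for U in fault_sets]
            detected = all(s1 for (_, s1) in sigs)
            identified = len(set(sigs)) == len(sigs)
            if detected and identified:
                return D
    return None

print(minimum_gics(ADJ, 1))  # a minimum GICS for k = 1 on the assumed graph
```

Of course, this brute force blows up immediately on real networks, which is exactly why a dedicated solver like gismo is needed.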
The approach taken by the current state of the art is to encode this NP-hard problem as an integer-linear program (ILP), and to use an off-the-shelf MIP solver like CPLEX to solve it. The problem with this approach is that it does not scale well with the number of nodes in the network $V$ and the maximum number of simultaneous failures $k$.
Intuitively, this approach explicitly encodes that all pairs of rows in the table above have to be different from each other. As a result, the number of linear constraints in the encoding scales exponentially with $k$: the ILP encoding of the GICS problem has $O\big(\binom{|V|}{k}^2\big)$ linear constraints, which grows as $O(|V|^{2k})$, and hence is exponential in $k$. In this work, we adapted the encoding from (Padhee, Biswas, Pal, Basu and Sen, 2020) to work for $k > 1$.
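To get a feeling for this blow-up, we can simply count the pairs of fault sets that the ILP has to separate. This is only an order-of-magnitude sketch (the exact encoding has further constraints per pair), but it shows the trend:

```python
from math import comb

def ilp_pair_constraints(n, k):
    """Rough count of the pairwise-difference constraints in the ILP:
    one per pair of distinct fault sets U, W with |U|, |W| <= k."""
    num_fault_sets = sum(comb(n, r) for r in range(1, k + 1))
    return comb(num_fault_sets, 2)

# For a modest 100-node network, the pair count explodes with k:
for k in (1, 2, 3):
    print(k, ilp_pair_constraints(100, k))
```

Already at $k = 3$, a 100-node network yields over $10^{10}$ pairs, far beyond what fits in RAM as explicit linear constraints.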
In this work, we propose to reduce the GICS problem to the computationally harder problem of finding a grouped independent support of a Boolean formula, which I will say a bit more about now.
Let’s first get an intuition for what an independent support is, and then define the concept of a grouped independent support.
Intuitively, the independent support $I$ of a Boolean formula $F(X)$ (Chakraborty, Fremont, Meel, Seshia and Vardi, 2014) is a subset of variables with the special property that, for every solution of $F(X)$, the truth values of the variables in $I$ uniquely determine the truth values of the variables in $X \setminus I$. I will illustrate this with an example.
Consider the following Boolean formula on three variables: $F(X) := (x_1 \lor x_2) \leftrightarrow x_3$. Here is a truth table for this formula, which only lists the variable assignments that are solutions to this formula (here, I am using $\sigma: X \to \{0,1\}$ to indicate an assignment of truth values to variables, rather than a signature of a (set of) node(s)):
Let us now look at two different ways of projecting these solutions on subsets of variables:
In the left table, we see that projected solutions $\sigma_{1 \downarrow S}$ and $\sigma_{2 \downarrow S}$ coincide, i.e., $\sigma_{1 \downarrow S} = \sigma_{2 \downarrow S}$. Hence, the cardinality of the set of projected solutions is smaller than the cardinality of the set of full solutions. On the other hand, in the right table, we see that none of the projected solutions coincide: all projected solutions are unique. Hence, projecting on projection set $I := \{x_1, x_2\}$ preserves the cardinality of the set of solutions in this example.
This is another way to look at independent supports of Boolean formulae. An independent support $I \subseteq X$ of a Boolean formula $F(X)$ is a subset of variables that preserves the cardinality of the set of solutions under projection. We will use this property later on, in our reduction of the generalised identifying code set problem to the grouped independent support problem.
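This cardinality-preservation view is easy to check by brute force for the running example (variables are indexed 0, 1, 2 for $x_1, x_2, x_3$):

```python
from itertools import product

# All solutions of F(X) := (x1 ∨ x2) ↔ x3, as tuples (x1, x2, x3).
solutions = [
    (x1, x2, x3)
    for x1, x2, x3 in product([0, 1], repeat=3)
    if (x1 or x2) == x3
]

def preserves_count(projection_indices):
    """A variable subset is an independent support iff projecting the
    solution set onto it preserves the set's cardinality."""
    projected = {tuple(s[i] for i in projection_indices) for s in solutions}
    return len(projected) == len(solutions)

print(len(solutions))           # 4 solutions in total
print(preserves_count((0, 1)))  # True:  I = {x1, x2} is an independent support
print(preserves_count((0, 2)))  # False: S = {x1, x3} collapses two solutions
```

This matches the two projection tables above: projecting on $\{x_1, x_3\}$ merges two solutions, while projecting on $\{x_1, x_2\}$ keeps all four distinct.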
In our work, we define a grouped independent support (GIS) as an extension of the vanilla independent support. We assume that we are given not only Boolean formula $F(X)$, but also a partition of $X$, $\mathcal{G}$. We now ask to find a subset of that partition $\mathcal{I} \subseteq \mathcal{G}$ such that $I := \bigcup_{G \in \mathcal{I}} G$ is an independent support of $F(X)$. Now, variables are grouped together in little subsets of variables. A variable can only be added to an independent support together with all variables in its group. Hence, for each group, it holds that either all variables in that group end up in the GIS, or none of them do.
Let’s consider an example, using the same Boolean formula as above: $F(X) := (x_1 \lor x_2) \leftrightarrow x_3$. Consider the partition $\mathcal{G}_1 := \{\{x_1, x_2\}, \{ x_3 \}\}$. Since $I := \{x_1, x_2\}$ is an independent support of $F(X)$, a grouped independent support of $\langle F(X), \mathcal{G}_1 \rangle$ is $\mathcal{I}_1 := \{\{x_1, x_2\}\}$.
Now consider the partition $\mathcal{G}_2 := \{\{x_1\}, \{x_2, x_3\}\}$ instead. Since neither $\{x_1\}$ nor $\{x_2, x_3\}$ is an independent support of $F(X)$, the only possible grouped independent support that we can find for $\langle F(X), \mathcal{G}_2 \rangle$ is $\mathcal{I}_2 = \mathcal{G}_2 = \{\{x_1\}, \{x_2, x_3\}\}$, since $\{x_1, x_2, x_3\} = X$ is an independent support of $F(X)$.
We now define the GIS problem as follows:
Given a Boolean formula $F(X)$ and a partition of the variables $\mathcal{G}$, find a subset $\mathcal{I} \subseteq \mathcal{G}$ such that $I := \bigcup_{G \in \mathcal{I}} G$ is an independent support of $F(X)$, and $| \mathcal{I} |$ is minimised.
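For intuition, the GIS problem can be solved by brute force on toy instances. The following Python sketch (mine, not gismo’s algorithm) tries subsets of groups in order of increasing cardinality and returns the first subset whose union is an independent support; all names are illustrative:

```python
from itertools import combinations, product

def is_ind_support(formula, variables, support):
    """Projection on `support` preserves the number of solutions."""
    sols = [
        dict(zip(variables, vals))
        for vals in product([False, True], repeat=len(variables))
        if formula(dict(zip(variables, vals)))
    ]
    projected = {tuple(s[v] for v in support) for s in sols}
    return len(projected) == len(sols)

def min_gis(formula, variables, groups):
    """Brute force: smallest subset of `groups` whose union is an
    independent support of `formula`."""
    for r in range(len(groups) + 1):
        for subset in combinations(groups, r):
            support = [v for g in subset for v in g]
            if is_ind_support(formula, variables, support):
                return list(subset)
    return None

F = lambda s: (s["x1"] or s["x2"]) == s["x3"]
X = ["x1", "x2", "x3"]

print(min_gis(F, X, [("x1", "x2"), ("x3",)]))  # [('x1', 'x2')]
print(min_gis(F, X, [("x1",), ("x2", "x3")]))  # [('x1',), ('x2', 'x3')]
```

The two calls reproduce the $\mathcal{G}_1$ and $\mathcal{G}_2$ examples above: with the first partition, the single group $\{x_1, x_2\}$ suffices; with the second, the entire partition is needed.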
So how do we reduce the GICS problem to the GIS problem? That’s a great question, I’d love to tell you!^{4}
As mentioned above, GICS problems are typically formulated on graphs. Given a graph $\Gamma := (V,E)$, we define two variables for each node $v \in V$. One variable, $x_v$, models the first element of the signatures. The other variable, $y_v$, models the second element. We partition the set of all variables as follows: $\mathcal{G} := \{ G_v := \{x_v, y_v\} \mid v \in V\}$. Hence, in the hotel example problem above, we would have ten variables, partitioned into five groups.
Key to our reduction is that we assume that each room has a smoke detector, and then remove smoke detectors from rooms, until we have the smallest possible set of rooms with smoke detectors, such that each set of fires of cardinality at most $k$ can be uniquely identified.
Assuming that each room contains a smoke detector, variable $x_v$ is an indicator variable that is True if the smoke detector in room $v$ detects smoke at time $t_0$. Note that this means that $x_v$ is essentially an indicator variable that is True if a fire breaks out in room $v$ at $t_0$, and False otherwise. Similarly, $y_v$ is an indicator variable that is True if the smoke detector in room $v$ detects smoke at time $t_1$.
Recall the dynamics of smoke detectors detecting smoke from the same room and from adjacent rooms. We can model these dynamics in the following set of constraints:
\[F_{\text{detection}} := \bigwedge_{v \in V} \left(y_v \leftrightarrow \bigvee_{u \in N_1^+(v)} x_u\right)\]

We must also encode the assumption that at most $k$ fires break out at the same time, and do that with a cardinality constraint:

\[F_{\text{cardinality},k} := \sum_{v \in V} x_v \leq k\]

Hence, we obtain the following formula:

\[F_k (X,Y) := F_{\text{detection}} \land F_{\text{cardinality},k}\]

Note that we need $| V | + \sum_{v \in V} \deg(v) = 2 \cdot | E | + | V | = O(| E |)$ material equivalences to encode $F_{\text{detection}}$, and that $F_{\text{cardinality},k}$ can be encoded using $O(k \cdot | V |)$ clauses (Sinz, 2005).^{5} Hence, in this encoding, $F_k (X,Y)$ has $O(k \cdot | V | + | E |)$ clauses. Note that our encoding grows linearly with both network size and $k$, whereas the ILP encoding of the former state of the art grows exponentially with $k$.
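As a sanity check of this construction, here is a Python sketch that evaluates $F_k(X,Y)$ semantically (no CNF, no auxiliary variables) by brute force. The hotel graph from the example is not reproduced here, so the adjacency list below is a hypothetical path graph of my own choosing:

```python
from itertools import product
from math import comb

# A small hypothetical graph (not the paper's hotel example):
# a path of four rooms a - b - c - d.
neighbours = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
V = sorted(neighbours)
k = 1

def satisfies_Fk(x, y):
    """F_detection: each y_v is the OR of x_u over the closed
    neighbourhood of v; F_cardinality: at most k fires."""
    detection = all(
        y[v] == (x[v] or any(x[u] for u in neighbours[v])) for v in V
    )
    return detection and sum(x.values()) <= k

sols = [
    (xs, ys)
    for xs in product([False, True], repeat=len(V))
    for ys in product([False, True], repeat=len(V))
    if satisfies_Fk(dict(zip(V, xs)), dict(zip(V, ys)))
]

# y is fully determined by x, so the number of solutions equals the
# number of fire patterns with at most k fires.
assert len(sols) == sum(comb(len(V), j) for j in range(k + 1))
print(len(sols))  # 5 for |V| = 4, k = 1
```

The final assertion checks the solution count against the binomial sum, which matches the counting argument below.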
By construction, the above formula has exactly one solution for each signature that we need. After all, $F_{\text{detection}}$ is always satisfiable without any additional constraints. However, adding $F_{\text{cardinality},k}$ ensures that the only solutions are those in which at most $k$ variables in $X$ are True, while the others are False. Hence, $F_k (X,Y)$ has $\sum_{j=0}^k \binom{| V |}{j}$ solutions.
The trick is to find a GIS $\mathcal{I} \subseteq \mathcal{G}$ of minimum cardinality, such that projecting all solutions of $F_k (X,Y)$ on $I := \bigcup_{G \in \mathcal{I}} G$ preserves the cardinality of the set of solutions. Each group in $\mathcal{I}$ corresponds to exactly one node, so the solution to the GICS problem is $D := \{ v \in V \mid G_v \in \mathcal{I}\}$.
Intuitively, we try to find the smallest subset of nodes, such that, when we project the solutions of $F_k (X,Y)$ on the variables associated with those nodes, none of the solutions coincide. By construction, each solution to $F_k (X,Y)$ corresponds to exactly one signature, and vice versa. Hence, if solutions were to coincide, this means that signatures coincide and they are no longer unique.
I know that this all sounds a little bit abstract. I will give an example below, in the explanation of how our new tool, gismo, works.
Having a succinct encoding of a problem is only useful if you have a tool that can solve the problem in that encoding. Hence, we need a tool that finds a minimised grouped independent support, given a Boolean formula and a partition of its variables. In our work, we introduce such a tool: gismo.
I will give a high-level overview and intuition of what gismo does and how it works, using the hotel smoke detector example problem described above. For simplicity and brevity, let us assume $k = 1$. We now have the following truth table for $F_1 (X,Y)$:
Gismo is initialised with candidate grouped independent support $\mathcal{I} := \mathcal{G} = \{G_v := \{x_v, y_v\} \mid v \in V\}$; the corresponding candidate independent support is $I := \bigcup_{v \in V} G_v$. Gismo then iterates over the groups $G_v$ in the partition. For each group, it checks for each variable whether that variable can be removed from $I$ such that $I$ remains an independent support of $F_k (X,Y)$. If both variables can be removed, the group is removed from $\mathcal{I}$ and the algorithm moves on to the next group. If at least one variable needs to be part of $I$ for $I$ to be an independent support of $F_k (X,Y)$, then the group remains an element of $\mathcal{I}$ and is never considered again.
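In simplified Python (my own sketch, not gismo’s implementation), the loop looks roughly as follows. For brevity, I drop whole groups at once rather than testing the two variables separately, and I use a brute-force projection check where gismo makes SAT calls based on Padoa’s Theorem:

```python
from itertools import product

def is_ind_support(formula, variables, support):
    """Brute-force check: projection on `support` preserves
    the number of solutions of `formula`."""
    sols = [
        dict(zip(variables, vals))
        for vals in product([False, True], repeat=len(variables))
        if formula(dict(zip(variables, vals)))
    ]
    projected = {tuple(s[v] for v in support) for s in sols}
    return len(projected) == len(sols)

def greedy_gis(formula, variables, groups):
    """Start from all groups; tentatively drop each group and keep
    the drop only if the remaining union is still an independent
    support. Returns a subset-minimal (not necessarily
    cardinality-minimal) grouped independent support."""
    kept = list(groups)
    for g in groups:
        candidate = [h for h in kept if h != g]
        support = [v for h in candidate for v in h]
        if is_ind_support(formula, variables, support):
            kept = candidate
    return kept

F = lambda s: (s["x1"] or s["x2"]) == s["x3"]
X = ["x1", "x2", "x3"]
print(greedy_gis(F, X, [("x1", "x2"), ("x3",)]))  # [('x1', 'x2')]
```

Like gismo, this greedy loop never reconsiders a group once it has decided to keep it, which is why the result is subset-minimal rather than cardinality-minimal.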
As an example, suppose that gismo starts with group $G_{\textbf{R}}$, and first tests if $y_{\textbf{R}}$ can be removed from $I$:
A quick inspection of the truth table above tells us that this is fine. All projected solutions are unique, none of them overlap. Gismo then checks if $x_{\textbf{R}}$ can also be removed from $I$. It turns out that it can, so gismo removes $G_{\textbf{R}}$ from $\mathcal{I}$ and moves on to the next group.
Suppose the next group is $G_{\textbf{O}}$. A quick inspection of the table shows that both $x_O$ and $y_O$ can be removed from $I$, and hence gismo also removes $G_{\textbf{O}}$ from $\mathcal{I}$, such that what we have left is $\mathcal{I} = \{G_{\textbf{B}}, G_{\textbf{G}}, G_{\textbf{P}}\}$:
Suppose that gismo now decides to process group $G_{\textbf{P}}$, starting with variable $y_{\textbf{P}}$:
This is still fine, but as soon as gismo tries to remove $x_{\textbf{P}}$, something is wrong:
Specifically, we now see that the solutions for $\varnothing$ and $\{\textbf{P}\}$ overlap! Hence, gismo concludes that $G_{\textbf{P}}$ must be part of the GIS, and does not remove it from $\mathcal{I}$, but processes the next variable group.
Eventually, gismo finds that $\mathcal{I} := \{G_{\textbf{B}}, G_{\textbf{P}}\}$ is a GIS of $\langle F_1(X,Y), \mathcal{G}\rangle$, and returns. The resulting projected solutions are the following:
Notice how all the projected solutions are unique, and notice how they directly encode the signatures that we found for $k = 1$ and $D := \{\textbf{B}, \textbf{P}\}$ in the example problem at the beginning of this post.
A key element of gismo is checking if a candidate GIS is indeed a GIS. Following the existing literature on independent supports, we do this by using Padoa’s Theorem (Padoa, 1901). Modulo a lot of details, the way to test whether a variable can be left out of the independent support is to construct a Boolean formula and check if it is unsatisfiable. If it is unsatisfiable, the variable is uniquely determined by the remaining candidate support and can safely be removed; otherwise, it must be part of the independent support.
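Concretely, a common formulation of Padoa’s check in the independent-support literature looks as follows; this is a sketch that omits the implementation details in our paper. Here, $X'$ is a fresh copy of the variables $X$, and we test whether variable $x_i$ is defined by the variables in $I$:

\[\Psi(x_i, I) := F(X) \land F(X') \land \bigwedge_{x_j \in I} (x_j \leftrightarrow x'_j) \land x_i \land \lnot x'_i\]

If $\Psi(x_i, I)$ is unsatisfiable, then no two solutions of $F$ can agree on $I$ while disagreeing on $x_i$, i.e., $x_i$ is uniquely determined by the variables in $I$.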
Hence, gismo makes at most $2 \cdot | V |$ SAT calls during its execution. We can think of these SAT calls as calls to a co-NP oracle. Consequently, testing if a candidate GIS is indeed a GIS is computationally more expensive than checking if a candidate GICS is indeed a GICS, since the latter check is polynomial in the size of the encoding (which in itself can be exponential in $k$, as discussed above).
You may also have noticed that, in the current implementation, gismo is a greedy algorithm. Hence, it can only guarantee a subset-minimal GIS.^{6} However, the ILP method for solving GICS as described in the literature guarantees a cardinality-minimal solution. Hence, it is possible that gismo returns a solution with a larger cardinality than the solution returned by the ILP method.
However, there is no good reason why we couldn’t create an implementation of gismo with the same guarantees as given by the former state of the art, other than that we just couldn’t be bothered. Feel free to contact us if you’re bored and want to give it a shot!
Also: we couldn’t be bothered to implement any optimisations. In fact, gismo processes the groups in lexicographical order. This is not how you would do it if you wanted gismo to be fast, nor if you wanted gismo to return a solution with a cardinality that is as small as possible. There are plenty of heuristics that can be used for a speedier execution of the algorithm and a smaller returned solution^{7}, but we just never adapted them to the grouped setting. Again: if you’re bored, feel free to fork our repository and implement some of those heuristics!
Our results can probably be summarised best in the following figure:
This scatter plot shows the running times of the entire encode+solve pipelines for our GIS-based pipeline (vertical axis), compared to the ILP-based pipeline of the former state of the art (SOTA; based on Padhee, Biswas, Pal, Basu and Sen, 2020), in CPU seconds, for 50 different networks and 9 different values of $k$ (indicated by colour). The network sizes vary from 10 to over 1 million nodes, and 14 to 1.5 million edges, with median degrees varying from 1 to 78. They are real-world networks sourced from the Network Repository (Rossi and Ahmed, 2015), and include networks of different types, like social networks and road networks. Any data point below the diagonal indicates that our gismo method solved that (network, $k$) combination faster than the former state of the art. Note that the axes are in log scale. We used a timeout of 1 hour (3600 seconds), and allowed 4 GB of RAM.
In the above figure, we see that many of the instances ran out of either time or space. The largest instance that could be encoded to CNF using our method had 227 320 nodes. On the other hand, the largest network that could be encoded into ILP by the former SOTA had 494 nodes. Hence, our method could encode networks that were 460$\times$ larger than the largest network that could be encoded by the state of the art.
The ILP method could also solve that instance within the permitted time (even though it could only encode it for $k=1$). The largest instance solved by our method had 21 363 nodes, and it could be solved for all tested values of $k$. This is a 43$\times$ improvement upon the former state of the art.
All instances that could be solved by the ILP method, could also be solved by our gismo-based method. For those instances that could be solved by both methods, we compared the cardinalities of the returned solutions. We found that for the majority of those instances, the cardinality of the solution computed by the gismo-based method was either equal to, or at most 10% larger than the cardinality of the solution returned by the ILP-based method.
Recall that we made zero effort to optimise gismo to make it fast or return a small solution, so we think that there is still a lot of room for improvement here. On the other hand, we spent quite some time trying to improve the ILP-based method, managing to increase the size of the largest network it could handle from ~300 nodes to ~500 nodes. However, we always ended up paying either in terms of memory use, or in terms of running time. You can find a link to the extended version of our paper with more details on this here.
In this work, we demonstrated how we can solve larger problem instances, and solve them faster, by reducing the original problem to a computationally harder problem, and developing a solver that is fast and good enough to be of practical use.
Note that this is possible because today’s SAT solvers are so incredibly fast at solving instances of NP-hard problems. This opens the door for us to start developing practical tools that model and solve problems that are higher up the polynomial hierarchy.
Exciting times!
The Identifying Code Set (ICS) problem was originally introduced 25 years ago by Karpovsky, Chakrabarty and Levitin. Despite its age, it is still being studied today. It has applications in, e.g., identifying criminals in social networks (Basu & Sen, 2021a), identifying spreaders of misinformation in online networks (Basu & Sen, 2021b), and even the deployment of satellites (Sen, Horan Goliber, Basu, Zhou and Ghosh, 2019).
On one memorable occasion in Melbourne, I spent 90 minutes arguing with people at the reception desk of a hostel about the exact meanings of “light off”, “light flashing”, and “light on” in the hostel’s smoke detectors. I have since learned to always carry my own when I travel.
In the literature, e.g., (Karpovsky, Chakrabarty and Levitin, 1998), a signature is also referred to as an identifying code.
Is this a reference to the hilarious Elyse Myers? That’s a great question, I’d love to tell you!
CNF encodings of cardinality constraints typically introduce auxiliary variables. I am omitting those from the discussion in this blog post, for the purposes of brevity (ahem), but please see our paper for more details on what role the auxiliary variables play in our approach.
At least, in theory. Implementation-wise, this claim is a bit more nuanced. Please read our paper for details.
So I selected seven of the books I read as a doctorate student, and bought a few copies of each. These books all changed the way I looked at the world, at science, or at myself. Some of them held a mirror up to me. Some of them inspired me to be a better person, or to work to make the world a better place. Some of them made me realise that my responsibilities as a scientist extend further than I previously thought.
Each of them helped me grow as a person, and each of them influenced my journey as a doctorate student and as a scientist in one way or another.
I distributed them randomly among my advisors, committee members, and paranimphs, with the explicit permission to each of them to swap with each other or to give their book to one of their friends, coworkers or students, if they felt that their book didn’t resonate with them and would be more appreciated by someone else.
It seems that by now, all the books that I had to send overseas have arrived at their destinations, and I wanted to share a bit about which books I picked and why.
While this book in itself is a great read, full of jokes and wisdom, it is mostly a placeholder for The Guilty Feminist podcast. Early into my doctorate, my advisor Siegfried’s tenant recommended the show to me, and I’ve been a listener ever since. The main message of the podcast is: you don’t have to be perfect to make meaningful change.
It’s through this podcast that I dared to call myself a feminist. Deborah, her co-hosts and her guests motivated me to show up, to speak my mind, and to stand up for myself and others. Additionally, this show was a gateway drug (to borrow a phrase from DFW herself) into many modern (feminist) thinkers, with whom I connected and interacted online.
It broadened my world and my thinking. It gave me a lifeline, a place of recognition, a place of growth. It gave me joy and companionship when I felt isolated as a woman in a male-dominated field.
I hadn’t actually read this book (although I bought myself a copy and am in the middle of it now), but in a way, this is also a placeholder book for me. It stands for Nanette and Douglas, Gadsby’s Netflix Specials, as well as for her speeches (this one and this one, in particular).
I recognise a lot of myself in Gadsby’s stories, and she does an incredibly good job at telling them. I’ve used both Nanette and Douglas as tools to communicate truths about myself to other people. She has words for things that I cannot say.
She has so much wisdom to offer. There’s a reason I cite her in the final chapter of my dissertation.
I read this book in less than two days. It is a memoir of a female academic, trying to make her way in a male-dominated field. I can relate. It was incredibly painful to read.
It is also a book about tree facts. Those were a joy! For months after reading this book, I infodumped endless tree facts on my friends and family. A great read!
Unlike Lab Girl, this book took me months to finish, mostly because I could only read about one chapter a week, needing time to process what I’d read.
I was taught Greek myths in school, but it was through this retelling of the story of Circe that I finally related to them in a way that made me understand why they are classics. I guess it also helped that I now have some life experience under my belt. This book really hit me hard.
For extra context about why Miller wrote this book, I can also recommend Ezra Klein’s interview with the author.
I think every scientist should read these two books by Angela Saini.
The first, Inferior, is about research into the differences between men and women. Saini does a great job at dismantling the flawed logic and assumptions behind research questions, study design and interpretation of data. She’s a great critical thinker and understands the power of the stories we tell ourselves and others.
In Superior, she does the same, but for the research into the difference between “races”. It’s an absolutely terrifying read, illustrating the destructive power of the clout that comes with academic titles.
These books stimulated me to look critically at any piece of research that is about people, or has conclusions that affect people, because they make it very clear how bias affects scientific research.
Before I read this book, all I knew about the story of Henrietta Lacks was: “White doctor takes cells from Black patient without her knowledge or consent, and gets rich and famous.”
This book provided a lot of context, from a lot of different angles. It is about journalism, science, history, and the personal histories of Henrietta and her family. I’m truly amazed by how much the entire world has benefitted from Henrietta’s cells, and mortified by how her family was treated.
This was a fascinating read, and I’m happy that Henrietta’s story has been getting some more attention the last years. Every scientist should read this book.
I hope you, reader, may pick up one or more of these books one day. Please let me know if you do, and what you thought of it!
First, I thank my advisor, Dr. Siegfried Nijssen. His relentless hard work to help me finish our first paper (and the later ones) was motivating and made me feel like we were a team. His patient mentorship made me comfortable admitting my mistakes, so he could help me fix them. He taught me how to answer scientific questions, and helped me overcome adversity and failure. I treasure his scientific, career-related and personal guidance and support. Someone once asked me to describe Siegfried. After 30 or so seconds, they said: “Whoa, you must really like your advisor, because you are literally lost for words to describe him!” That’s accurate.
I also thank my promotor, Prof. Dr. Holger H. Hoos, who adopted me into his research group when I was already many months into my doctorate research, and helped me arrange a research visit to the University of Toronto. Without him, my attitude towards algorithm development and my opinion on how it should be done, would not have been the same. He created opportunities for me and taught me how to recognise them as such, for which I am very grateful. In addition, he taught me the one-liner that informs how I look at any scientific research, be it my own or someone else’s: be a benevolent critic.
I thank my other promotor, Prof. Dr. Joost N. Kok, with whom I started the PhD journey. He has been an optimistic and trusting mentor to me, which was very comforting. When it mattered, he was always quick to respond and asked those questions that I had not yet thought of.
I also thank my advisor Prof. Dr. Fahiem Bacchus for welcoming me at the University of Toronto and opening my eyes to new avenues of science and to new approaches to problem-solving.
Next, I thank the members of my doctorate committee, Prof. Dr. C.M. Jonker, Prof. Dr. K. Kersting, Prof. Dr. H.C.M. Kleijn, Prof. Dr. P. Lucas, Prof. Dr. A. Plaat, and Prof. Dr. P. Stuckey for reading my dissertation, deliberating its scientific value, and giving me valuable feedback to improve its contents.
I thank Dr. M. van Leeuwen and Prof. Dr. M. Bonsangue for being my opponents during my defence.
My work was supported by the Netherlands Organisation for Scientific Research (NWO), and I thank NWO for funding my research. I thank Google for awarding me a Women Techmaker’s scholarship, and I thank the University of Toronto for funding my research visit to their Department of Computer Science. I thank the Netherlands Research School for Information and Knowledge Systems (SIKS) for their financial support of workshops and the printing of my dissertation. Finally, I thank the Leiden Institute of Advanced Computer Science (LIACS) for funding my return to Leiden for my dissertation defence.
I am grateful to my co-author Dr. Behrouz Babaki, who was the first person who made me feel comfortable talking about my own research, by being genuinely curious in a completely honest and non-intimidating manner. He is a good coder and a great friend.
I thank my other co-authors, Dr. Anton Dries, Prof. Dr. Angelika Kimmig, Prof. Dr. Guy Van den Broeck, Daniël Fokkinga, Marie Anastacio and Jeroen Rook for all their hard work, their critical questions, their smart insights, our fruitful discussions, and all the jokes that made the pill of late-night writing easier to swallow.
I had the great fortune to work at three different universities in three different countries during my time as a doctoral student. In addition, I was lucky to retain a strong connection with KU Leuven, and I am very grateful to the many, many coworkers who welcomed me there, joined me for lunch or games, and cheered up my coffee breaks.
I thank my co-workers at the Institute of Information and Communication Technologies, Electronics and Applied Mathematics (ICTEAM) from Université catholique de Louvain for our discussions in the “cafet” and on Slack, and for the very best social activities that academia has to offer: Gaël Aglin, François Aubry, Roberto d’Ambrosio, Vincent Branders, Simon Busard, Antoine Cailliau, Quentin Cappard, Guillaume Derval, Yves Deville, Fabien Duchêne, Benoît Duhoux, Pierre Dupont, Xavier Gillard, Olivier Goletti, Seyed Hossein Haeri, Mathieu Jadin, Nicolas Laurent, Alex Mattenet, Guillaume Maudoux, Kim Mens, Etienne Riviere, Peter van Roy, Pierre Schaus, Michael Saint-Guillain, Charles Thomas, Sascha Van Cauwelaert, Hélène Verhaeghe.
I thank my co-workers from Leiden Institute of Advanced Computer Science (LIACS) at Leiden University for the games of Codenames that we played in the FooBar, for the afternoon walks, and for all the moral and practical support: Laurens Arp, Mitra Baratchi, Alice Bisschop, Koen van der Blom, Hanjo Boekhout, Abdeljalil el Boujadayni, Yi Chu, Daniela Gawehns, Marcello Gómez Maureira, Pieter Leyman, Hugo Manuel Proenca, Irene Martorelli, Rens Meerhoff, Thomas Moerland, Gilles Ottervanger, António Pereira Barata, Bram Renting, Jan van Rijn, Lise Stork, Frank Takes, Jaco Tetteroo, Suzan Verberne, Lieuwe Vinkhuijzen, Can Wang, Hao Wang, Felix Wittleben, Furong Ye.
I am grateful to my co-workers from the Department of Computer Science of the University of Toronto for our lunches, the games of Hanabi we played over Zoom, and the updates on the well-being of office plants: Toryn Klassen, Andrew Li, Sheila McIlraith, Margarita Paz Castro, Maayan Shvo, Rodrigo Toro Icarte.
I thank my former co-workers (and plus-ones) from KU Leuven, for keeping in touch with me and inviting me to lectures, defences and drinks. In particular, I thank Jessa Bekker & Ruben for opening their home to me so many times. I am also grateful to Hendrick Blockeel, Sebastijan Dumančić, Samuel Kolb, Luc De Raedt, and Nikolina Šoštari, for including me in both social and scientific activities.
I am grateful to the system administrators who built and maintained the servers on which I ran my experiments, and to the administrative staff who helped me navigate bureaucracy. I thank the cleaners, concierges, receptionists, security personnel, mess hall employees and others who made my workspaces comfortable and safe. I thank the following people in particular: Alexandra Blank, Anthony Gégo, Vianney Govers, Alexa Kodde, Vanessa Maons, Jayshri Murli, Marloes van der Nat, Ludovic Taffin, Liselotte van der Woerd.
I am also grateful to the mentors that I found inside and outside academia, who made me stronger, wiser and kinder. In particular, I thank Irma Ravkic, whose encouragement, support and feedback helped me to open many doors. I thank fellow Google scholars Ibtihal Ferwana, Alison O’Shea, Hila Peleg, and Christina Zaga, for being such inspiring and supportive forces in my life. I thank my paranimphs Jessa and Behrouz for standing by my side, now and always.
Finally, I thank my friends and family. I do not mention them by name, because of privacy, but this dissertation would not have existed without their patience and support. They cheered me on and cheered me up. They knew me before I did. They were not scared away by my flaws, but reminded me of my talents. Not all of them are with us any more, but all of them are with me. I am grateful to have them in my life.
Note: part of this text was published in my dissertation. Since I had a strict word limit of max 800 words there, I wrote this extended version, such that I could mention more people by name. There are so many people I am grateful to, that I likely have forgotten at least one. My apologies! Please get in touch if you believe that someone is missing!
Fahiem was an amazing scientist, with an enormous influence in the SAT (Boolean satisfiability) community. He has an impressive list of accomplishments (including being the chair of the SAT Association, being a fellow of the Association for Advancement of Artificial Intelligence (AAAI), and being the recipient of Canadian Artificial Intelligence Association’s Lifetime Achievement Award), and we called him “Mr. MaxSAT”. But to me, Fahiem was more than that. Let me share a few personal memories of him.
I met him on a cold and stormy Friday evening (11 August 2017, to be precise), on a pier in St. Kilda, just outside Melbourne, Australia.
I was there with some of my former coworkers from KU Leuven, to watch the penguins come ashore. We’d arrived early, and had asked a local what to expect. He told us to come back in 90 minutes, and then we’d see “thousands of penguins” come up the pier. We went to the pub to warm up a bit, and when we returned 90 minutes later, I was surprised by the lack of people on the pier. Surely, thousands of penguins would attract more of an audience? We looked out at the sea, but it was quite void of any penguin activity.
Then I heard one of the few other freezing people on the pier say to his companions: “If we’re lucky, we might see five or six penguins tonight!”
Sorry. What!?
I turned around and asked the stranger how he knew, and we got to talking. We started out with a discussion of penguin counts, but quite naturally moved on to alien abductions and the apocalypse. It was a pleasant conversation, and after having spotted five penguins, we each went our way.
The next morning I arrived bright and early for my first workshop day of my very first conference: IJCAI 2017. At registration, I bumped into the stranger I had met the night before. I was quite surprised. A little later, I felt rather embarrassed when Fahiem walked on stage for the official opening, and I realised that last night’s stranger was actually the conference chair of IJCAI that year.
It turned out that Fahiem and I had quite some research interests in common. At the time, I was working in a department where people with whom I had research interests in common were pretty scarce. We kept in touch, one thing led to another, and Fahiem invited me to spend a few months with his group at the University of Toronto.
While at UofT, Fahiem taught me much about how to think about algorithms and complexity. Each meeting we had ended with me feeling a bit dumb, because I was made more aware of how much I really didn’t know or understand yet. At the same time, I felt honoured that he took the time to talk to me and teach me. I loved the reading groups that he organised. They changed the way in which I read papers from the SAT community. I am a lot more critical now, and understand better what is good and what is just air (even though I clearly still have a lot to learn).
I am so grateful for the time I got to spend with him and with his group in Toronto.
Fahiem was so generous. With his time, with his knowledge, with his humour. He will be missed so much by everyone who knew him.
We decided to submit an update of Jeroen’s work on caching for model counters, based on his master thesis work. It was accepted, so last week Jeroen presented our work.
While I was naturally happy to see our work presented, I also very much enjoyed the rest of the program. It has been a while since I’ve seen so many interesting talks in a row. I particularly liked the contributions from Tuukka Korhonen and Lucas Bang.
Jeroen’s slides can be found here, and a recording of his presentation is also available online.
Thanks to the organisers! I am looking forward to attending next year’s edition!
After his presentation, Jeroen defended his work to me and his other advisers, Prof. Dr. Holger Hoos and Prof. Dr. Siegfried Nijssen. He handled our questions as well as those from the audience quite well, impressing us with his knowledge of the #P complexity class.
A preliminary version of this work was accepted at the Workshop on Counting and Sampling 2021.
If I’m perfectly honest, I was quite insecure about reviewing for AAAI. It was my very first time as a PC member, and I was not quite sure that I was knowledgeable enough to be able to be sufficiently critical of the contents of a paper and to recognise flaws, especially those related to novelty and obscure theorems that have pages-long proofs.
My colleagues encouraged me, by reminding me that I had read and graded roughly 150 student papers/reports/theses already. For many of those, the feedback I provided led to new and improved versions, and in some cases even to scientific publications. All in all, my colleagues reckoned I was ready.
I then simply did the best I could. I did not receive any feedback from the AAAI-21 Program Committee, so I can only guess why they picked me among the thirteen outstanding PC members out of all those thousands. Just in case you are reading this looking for tips on how to be a good reviewer, here are my recommendations for reviewing anything, be it a scientific paper submitted for publication, or a student report for an (under)graduate course.
All good feedback is actionable. For example: instead of saying “Section 4 is unclear”, say “I would find it easier to understand Section 4 if concept X was formally defined.” Note that this also reflects the fact that your feedback is your opinion, not fact. It also indicates what made Section 4 unclear to you, and what would improve Section 4’s clarity for you.
If authors have the opportunity to write a rebuttal, it is helpful if you number your points of critique and your questions. They can then easily use the numbers to refer to and respond to these critiques and questions, saving valuable rebuttal space. Conversely, you can easily find their responses and check if they have addressed all your points.
It is also helpful if you indicate, at least to your fellow reviewers, but maybe also to the authors, what you took into account in your final assessment of the work. Especially if a paper ultimately is rejected, it is good to make explicit that this was, e.g., because the quality of the experiments was lacking and not because there were too many typos.
I firmly believe that if I help you thrive, it makes me thrive, too. I want my students to do well, because a) their success reflects well on me and on my university, and b) it’s simply much more fun if they succeed than if they fail.
Similarly, I want the papers at the conferences where I (try to) publish my work to be of high quality. I genuinely tried to come up with feedback that would make the next version of the work I was reviewing better, regardless of whether it would be accepted or not.
Naturally, it does not hurt to also compliment the authors on anything they did particularly well. We all struggle to get our work published, so encouragement and appreciation are always welcome. Furthermore, you will also benefit if people learn what makes a paper interesting or pleasant to read for you, so you can also compliment authors for thoroughly selfish reasons!
Reviewing papers is a great opportunity to learn, which can be a great motivator to put some serious effort into it. You get a glimpse into the latest advances in your (or a related) field, which I trust you find just as exciting as I do.
On top of that, you can also learn a lot about other people’s writing process. While reviewing, ask yourself if the language and the structure of the paper work for you. Are there helpful analogies? Are the acronyms clearly explained? Is there a particular framing that you find compelling, or quite the opposite? Take notes and learn!
Especially if it is your first time reviewing, you may not be familiar enough with all the material to assess its quality. Try to learn as you go, but also don’t overwork yourself. If you have reached the end of your knowledge, just be open about that to your fellow reviewers and area chair. Maybe they can fill you in, or at least take into account that your knowledge is lacking.
You may also not fully understand the process or the conventions, especially because not all of them are always clearly communicated. It is absolutely okay to indicate to the other reviewers and the area chair that you are unsure if a certain paper is out of scope or if certain conditions (like anonymity) are not met, as long as you indicate how and where you have looked for the answer and what the result was of that search. You may not be the only one who is confused, but you also do not want to cause a paper to be rejected just because something was not communicated clearly.
Out of all the people they could have asked to review this paper, they chose you. You are part of the community. You have knowledge. You have ideas. You have a vision. You are qualified to give it your best shot. You have a lot to offer, so show it to them.
You are not the only reviewer and it is okay to disagree with the others. If, during the discussion, you find that somebody makes a good argument, let yourself be persuaded and admit that you have changed your mind. Being able to change your mind when presented with compelling new information is a strength, not a weakness. If the new information is not compelling to you, it is also perfectly fine to not change your mind. You are entitled to your opinion.
This is really the summary of the points above. It is a phrase that Prof. Holger Hoos (my PI) likes to use, and I find it very helpful. We also use it in our group when we give each other feedback on our work. With any feedback I give, I ask myself if I am indeed being critical enough, and not slacking off or shying away from uncomfortable truths. Additionally, I ask if the feedback I give is constructive and respectful, and indeed likely to help the other person succeed, instead of demotivating them or dragging them down.
In these days of crazy reviewing load, we are grateful to have exceptional colleagues in our community: @thserra @aldlatour @tchakra2 @kr_t https://t.co/L3mO1Re7Hn
— Kuldeep S. Meel (@ksmeel) February 4, 2021
The award came as a complete surprise to me, if not to others. As one of my colleagues put it: “You cannot do anything without winning an award for it, can you?” Clearly that is untrue, but I am not going to deny that I was a bit flattered to hear it. Ultimately, it is nice to be appreciated by your community.
Congratulations to @aldlatour for being recognized as an outstanding reviewer at #AAAI2021! Well done! https://t.co/MQOK1sYPBl
— Siegfried Nijssen (@SGRNijssen) February 4, 2021
While I was away from social media, I missed this great achievement by Anna Latour @aldlatour :
— Behrouz Babaki (@BehrouzBabaki) August 26, 2021
She was one of the 13 outstanding reviewers (out of thousands) in AAAI'21. https://t.co/MlpAaz4MrZ