Posts

A quick talk on "Working with Bioinformatics"

Folks, a quick talk about working with bioinformatics. Bioinformatics job profiles form a gradient. At one extreme are the developers of bioinformatics tools. At the other extreme are the users of bioinformatics tools. In both cases you need a good knowledge of biology, especially genetics and biochemistry. It's important to note that the "user" profile mainly involves using web servers and desktop software. But keep in mind, it's a gradient, so there are many varied mixes of these two extremes. Although all the profiles are devoted to answering a biological question to some extent, "user" profiles are directly involved in it, while "developer" profiles aim to propose solutions that help solve many instances of the same problem class. Because of that, the best way (in my opinion) to start in bioinformatics and find your way is in the middle of the gradient, the so-called "bioinformatics sc...

The “hurting” Bioinformatics message I have seen

Is this one: "It's 2024. Most of #bioinformatics is yet to come and the best we can offer future generations of scientists is idiosyncratic R-scripts, massive Nextflow pipelines and file formats such as FastQ? We have to do better." (Fabian Klötzl) https://genomic.social/@kloetzl/111772016843202526 And at the end of the day… it kinda makes sense. Why does it hurt? Roughly speaking, R is the “default language” of bioinformatics. Even I, a pythonic headbanger, use R in bioinformatics research. Nextflow is gaining this “defaultness” too. For those who don’t know, it is a technology that you can use to build (among other things) data processing pipelines, and thus automate processes. Bioinformatics people have been using it a lot. “File formats such as FastQ” have been (again) the “default file formats” for years, maybe decades. Personal interpretation (I may be wrong, please confirm it in the comments section): the message is that ...

Reproducibility with Python - Notes and tips

I understand reproducibility as two situations: (1) when you (yes, you) get to run (on your (yes, your) own machine) the code produced by “someone else” and obtain the exact same results that “someone else” obtained on her/his/their machine; (2) when “someone else” gets to run on her/his/their own machine the code produced by you (yes, you) and obtains the exact same results that you obtained on your (yes, your) machine. How can we achieve (or get closer to) reproducibility for our Python code? Well, we must ensure that “someone else” will run our code using, at least, the same Python version we were using as well as the same versions of the packages we were using. How can we “help someone else” with this? I recommend using one of these two solutions in your projects: Poetry or Conda (either Anaconda or Miniconda). Both work with the concept of a virtual environment, a “sandbox” in which you can specify, use and “freeze” the “exact state” of package versions (and in cas...
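To make the idea of “freezing the exact state” a bit more concrete, here is a minimal sketch using only the Python standard library. It is not how Poetry or Conda work internally, and it does not replace them; it only illustrates what “the same Python version and the same package versions” means in practice. The function name `snapshot_environment` is hypothetical, made up for this illustration.

```python
import sys
from importlib import metadata


def snapshot_environment():
    """Return the running Python version and a sorted list of 'name==version' pins.

    Hypothetical helper for illustration only: it reads the packages
    installed in the current environment, nothing more.
    """
    # Major.minor.patch of the interpreter that is running this code.
    python_version = ".".join(str(part) for part in sys.version_info[:3])

    # One 'name==version' line per installed distribution.
    pins = sorted(
        (
            f"{dist.metadata['Name']}=={dist.version}"
            for dist in metadata.distributions()
            if dist.metadata["Name"]  # skip entries with broken metadata
        ),
        key=str.lower,
    )
    return python_version, pins


if __name__ == "__main__":
    version, pins = snapshot_environment()
    print(f"# python {version}")
    for pin in pins:
        print(pin)
```

If “someone else” runs this on their machine and the output differs from yours, the two environments are not in the same state, and that is exactly the gap that a Poetry lock file or a Conda environment file is meant to close.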

Learn foreign languages

It has been about a month and a half since I moved from my home country to another one. One of the biggest lessons I learned was: learn to speak the language of that country. Moving to another country brings many challenges and you inevitably get into trouble, mainly due to culture shock. However, when you don't know the language of the country you're moving to, the weight of all of this doubles, and everything gets harder. Yes, knowing English helps, but it is far from enough, though it is not worthless. On average, most people will know English at a basic level, enough for communication to happen. However, the number of people who know absolutely no English is way higher than you expect, and the probability of running into them is very high. That's why you definitely need to learn the language of the country you're moving to. Believe me, your mental health will thank you in a way you can't imagine.

Quick notes on Language Models

A study submitted in July 2023 reported that GPT is getting “dumber”, with decreased performance on math calculations and visual reasoning. This caused a bit of a fuss on the internet, but anyway, I’d like to make some notes here. An important note about these notes (jokes and puns aside): they all come from my head and from my experience and world knowledge. I did not consult any reference to make them (except for the links, of course), so be aware of this. Here are the notes: GPTs and Language Models based on Transformers learn from text data (as you may know) and represent this acquired knowledge as ontologies (roughly speaking, a network of interconnected concepts). Thus, prompting a GPT or any other language model is basically “querying an ontology”, of course, in a very smart and practical way. Hence, a Language Model does not actually do calculations per se, but queries the “ontology” to try to figure out the result. Hence (again), we shouldn’t expect that much of GPT in the “math side of the ...

A tour in the “Museu da Rampa”

A couple of weeks ago, I took a tour of the “Museu da Rampa” (Ramp Museum, in free translation from Portuguese), located in Natal, Rio Grande do Norte, Brazil. This museum tells stories about the arrival of the US Army in Natal during the Second World War, as well as the visit of then US president Franklin D. Roosevelt to Natal and his meeting with then Brazilian president Getúlio Vargas. For those who do not know, Natal was a strategic spot for the Americans during the Second World War, because it was faster for military planes to get to Europe departing from the city. Because of that, Natal received the nickname O trampolim da vitória (The Trampoline of Victory, in free translation). Vargas and Roosevelt then made an agreement to use Natal as a base for the US Army. This agreement marked the arrival of US soldiers in Natal and its neighboring city, Parnamirim, where military bases were located. The presence of the US military in Natal and Parnamirim deeply marked Natal’s and Parnamirim’s histo...

My Chess Openings Tour

I did a “Chess Openings Tour” from March 2023 to June 2023. As a beginner in chess, I wanted to build a small opening repertoire so that I would at least know how to start a game comfortably. The idea was to practice a single opening per week, alternating one week with an opening for White and one week with an opening for Black. I asked ChatGPT to build a list of openings for me to follow (one for White, and another for Black). After some adjustments, some of them made during the tour itself, the final list for White was: Italian Game (1.e4 e5 2.Nf3 Nc6 3.Bc4); Reti Opening (1.Nf3); Catalan Opening (1.d4 Nf6 2.c4 e6 3.g3); Scotch Game (1.e4 e5 2.Nf3 Nc6 3.d4); Vienna Game (1.e4 e5 2.Nc3); Bird Opening (1.f4); English Opening (1.c4); Ponziani Opening (1.e4 e5 2.Nf3 Nc6 3.c3). The list for Black was: French Defense (1.e4 e6); Caro-Kann Defense (1.e4 c6); Alekhine Defense (1.e4 Nf6); Modern Defense (1.e4 g6); Pirc Defense (1.e4 d6 2.d4 Nf6 3.Nc3 g6); King's ...