[MUSIC] Hello everyone. My name is Pimlapas Leekitcharoenphon. I'm a postdoc from the Division for Epidemiology and Microbial Genomics at DTU Food. I'm going to present multipurpose detection of genetic markers: the MyDbFinder tool description and application.
So right now, whole genome sequencing is becoming a more practical, rapid, and less expensive form of typing. Tools intended for quick analysis of whole genome sequencing data need to accommodate researchers who are interested in particular sets of genes for which no database is available. So we developed a special version of ResFinder called MyDatabaseFinder, or MyDbFinder. MyDbFinder is freely accessible and has a friendly web interface that allows users to generate their own database, containing the genes of interest that they wish to search for. So the tool contains your own database, a user database. The user also uploads their unknown genomes, as either raw reads or assembled genomes. The tool then takes the unknown genomes and searches them against the user database using BLAST to identify the best matching gene from the user database.
So the tricky part of MyDbFinder is to make your own database. How can you make your own database? The database for MyDbFinder has to be made in a text editor. We recommend Notepad or TextEdit; both programs are freely available. The database must contain the DNA sequences in FASTA format. So what is the FASTA format? FASTA is a standard format for storing DNA sequence data. Each entry starts with a header. The header has to begin with the greater-than sign, followed by any ID. The next line is the sequence data, A, C, T, G. And if you have multiple sequences, you continue with the next sequence in the same way: greater-than sign, ID, and sequence data.
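As a rough illustration of the database layout described here, the following Python sketch builds FASTA text from a few name and sequence pairs. The function names, gene names, and toy sequences are all hypothetical and are not part of MyDbFinder; the sketch also joins multi-word names with underscores so that each header is a single word.

```python
import re

def sanitize_id(name: str) -> str:
    """Replace runs of whitespace with underscores so the name is one word."""
    return re.sub(r"\s+", "_", name.strip())

def format_fasta(records, width=60):
    """Return FASTA text for (name, sequence) pairs.

    Each record becomes a '>' header line followed by the sequence,
    wrapped at `width` characters per line.
    """
    lines = []
    for name, seq in records:
        seq = "".join(seq.split()).upper()  # drop whitespace, use upper case
        lines.append(">" + sanitize_id(name))
        for i in range(0, len(seq), width):
            lines.append(seq[i:i + width])
    return "\n".join(lines) + "\n"

# Hypothetical gene names and toy sequences, for illustration only.
example = format_fasta([
    ("my gene 1", "ACTGACTGAC"),
    ("my gene 2", "ttgacca"),
])
print(example)
```

The printed text could be saved as a plain text file and used as a user database, assuming the tool accepts any valid multi-record FASTA file.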
And MyDbFinder only shows the first word of the header as the output gene name. For example, if your unknown sequence matches sequence number two, it will only show the first word of that header, because it breaks when it finds a space. So we recommend that you use distinct gene or sequence names without any spaces in them. For example, if you have a space in the name of the sequence, you put an underscore instead of the space. Then the program will show all the information of the header in the output page.
And here, this is the web link to MyDbFinder. This is the web interface that you see when you click on it. First of all, you start by browsing for your own unknown sequence; here you upload your unknown sequence. The next option is to make your own database. So you upload the database that you made in the text editor before. You browse for your own database; in this case it's a [inaudible] database. When you have browsed for your database, you have to choose some of the options that we provide here. The first one is the percent identity. The percent identity is the minimum percentage of nucleotides that are identical between the sequence from your unknown sample and the genes in your database. Another option, as a cutoff, is the minimum length. The minimum length is the percentage of the total length of the gene in the database that has to match the sequence in your unknown genome.
The last option that you can choose is the type of the reads that you just browsed for here. If your reads, or the data that you browsed for here, are assembled genomes, then you click assembled genome. And remember, the assembled genome has to be in FASTA format, like this. And if your unknown genome is raw reads, then it has to be in FASTQ format. The FASTQ format is similar to FASTA. FASTQ is FASTA plus quality scores; that's why it's called FASTQ.
So the first two lines are the sequence data, and the last two lines are the quality scores for the sequence. And if your data are raw reads, you have to choose whether they are single-end reads or paired-end reads, and you have to choose the technology, the sequencing platform, that you actually used. Okay, once you have browsed for your unknown genome, you browse for the database that you made on your own. You choose all the options here, using the correct type for your unknown sample, and then you're ready to submit your job by clicking Submit.
Now it leads you to another web page. It's telling you that your job is being processed. And we have another option: you can fill in your email and click notify me via email. Once your job is done, the program will send you an email with the output link in it. So you can actually start doing another submission by closing this window, or by opening another window and starting a new submission. You don't need to wait until this one is done to start another one.
Here is an output that you might get from MyDbFinder. It tells you the genes in the database that match your unknown sample, with the identity, the query length, and the HSP length. So what are they? The query length is the length of the best matching gene in the database. So it's actually the length of the gene in the database that matches your unknown sample. The HSP length is the length of the alignment between the best matching gene in the database and the sequence in your unknown sample. So, if the query length and the HSP length are equal, like this, it means the alignment between the sequence from the unknown sample and the gene in the database covers the entire length of the gene in the database. That indicates a perfect match, actually. The output also tells you the position.
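The relationship between the two cutoffs and the output columns can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it assumes the minimum length is checked as HSP length divided by query length, and the function names, cutoff values, and example hits are hypothetical rather than taken from MyDbFinder.

```python
def coverage_percent(query_length: int, hsp_length: int) -> float:
    """Percentage of the database gene covered by the alignment (assumed definition)."""
    return 100.0 * hsp_length / query_length

def passes_cutoffs(identity, query_length, hsp_length, min_identity, min_length):
    """True if a hit meets both the percent identity and the minimum length cutoff."""
    return (identity >= min_identity
            and coverage_percent(query_length, hsp_length) >= min_length)

def is_full_length(query_length: int, hsp_length: int) -> bool:
    """True if the alignment spans the entire gene in the database."""
    return hsp_length == query_length

# Hypothetical hits and cutoffs, for illustration only.
hits = [
    {"gene": "gene_A", "identity": 100.0, "query_length": 861, "hsp_length": 861},
    {"gene": "gene_B", "identity": 95.2, "query_length": 1200, "hsp_length": 600},
]
for hit in hits:
    kept = passes_cutoffs(hit["identity"], hit["query_length"], hit["hsp_length"],
                          min_identity=90.0, min_length=60.0)
    full = is_full_length(hit["query_length"], hit["hsp_length"])
    print(hit["gene"], "kept" if kept else "filtered out", "full length" if full else "partial")
```

In this toy example the second hit covers only half of the gene, so it would fall below a 60 percent minimum length even though its identity is above the identity cutoff.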
The position is where the sequence in your unknown sample matches the gene in the database. From the output, you can also see which percent identity and minimum length you chose before submitting, and the input file, the sample, that you submitted. If you want to see the details of the alignment between your unknown sample and the gene in the database, you can click extended output here, and you can see the details of all the alignments. If you want to get the output like this in text format, as a text file, you can click results as text. You get the same result, but in a text file. And if you want to get the sequence of the genes in the database that best match the sequence in your unknown sample, you can click one of the buttons here, and you get the sequence data of the gene in the database that matches your unknown sample.
And if you have any problems or any technical difficulties using our tool, you can click on technical problems and email us your problem. And thank you very much for watching. [MUSIC]