Crop Bioinformatics Adelaide (CroBiAd)
Crop Bioinformatics Adelaide (CroBiAd) provides bioinformatics expertise to the agriculture industry through the development of novel bioinformatics and biometrics methods for wheat genetics and breeding.
Our bioinformatics activities include:
- Research - in silico experimentation, such as development of novel methods, tools and pipelines.
- Consultancy - recommendations on experimental design for projects such as RNA-Seq and various types of DNA sequencing.
- Bioinformatics service provision - data analysis solutions and hands-on bioinformatics workshops in a fee-for-service model.
Based within the University of Adelaide's School of Agriculture, Food and Wine, we are the South Australian node of the EMBL-Australia Bioinformatics Resource (EMBL-ABR).
Biology as a data science
Biology is becoming ever more data driven. This is due in part to technological advances in DNA sequencing, which mean that many labs can now generate DNA sequence data at low cost. This has brought new opportunities to the field of crop biology, but it also brings challenges due to the size and complexity of the data.
Bioinformatics is inherently a multidisciplinary field at the intersection of biology, computer science and statistics. Our group is naturally multidisciplinary too, with members from different career paths and areas of expertise.
Genomics for wheat breeding
Our goal is to develop novel bioinformatics and biometrics methods for wheat genetics and breeding.
The size and complexity of the wheat genome pose some unique problems for bioinformatics. These require the development of novel approaches for data handling and analysis in areas including:
- Wheat genomics and transcriptomics
- Marker development
- Software development
- Pipeline and algorithm development
Our research is aligned to Program 5 of the Wheat Hub - the ARC Industrial Transformation Research Hub for Wheat in a Hot and Dry Climate.
Our group has strong collaborations with local biology-focused groups as well as with national and international industry partners. This has been recognised by an Emerging Industry Research Partnership Award.
Key industry and international collaborations
- DuPont Pioneer - a 10-year collaboration
- ARC Industrial Transformation Research Hub for Wheat in a Hot and Dry Climate (Wheat Hub)
- Biogemma
- International Wheat Genome Sequencing Consortium (IWGSC)
- Wheat Information System Expert Working Group
Group members
- Ute Baumann - Group Leader
- Elena Kalashyan - Programmer
- Nick Warnock - Bioinformatician
- Paul Eckermann - Biometrics Officer
- Melissa Garcia - Research Fellow
- Charity Chidzanga - PhD student
- Amritha Amalraj - PhD student (Visiting group member)
Alumni
- Andreas Schreiber - Research Fellow
- Andy Timmins - Postdoctoral Fellow in Bioinformatics
- Angus Wallace - Postdoctoral Fellow in Bioinformatics
- Timo Tiirikka - Bioinformatics Officer
- John Toubia - Bioinformatics Officer
- Virginie Perlo - Bioinformatics Officer
- Patrick Laffy - Bioinformatics Officer
- Joseph Sclauzero - Summer scholar, honours student
- Ngoc Vo - Summer scholar
- Anthony Clissold - Summer scholar
- Daniel Menadue - Summer scholar
- Jia Truong - Summer scholar
Training
Our group has expertise in the development and delivery of hands-on bioinformatics workshops.
Software - public
CroBiAd develops a wide range of software, predominantly aimed at biology end-users in order to democratise crop genomic resources.
Because of the nature of our funding and research, much of our code is developed internally and maintained using git repositories on a local GitLab install. As projects approach publication, these resources are published to our public GitHub account.
POTAGE
POPSEQ Ordered Triticum aestivum Gene Expression (POTAGE) is a visualisation tool for speeding up gene discovery in wheat.
DAWN: Diversity Among Wheat geNomes
DAWN provides access to a variety of public wheat data sets in the context of the International Wheat Genome Sequencing Consortium’s (IWGSC) RefSeq v1.0 genome assembly[1]. This is achieved by pre-processing the data and making it available through a JBrowse genome browser.
Resources
- JBrowse
- Coordinate converter - convert coordinates between chromosome parts and whole pseudomolecules
- Open data at figshare
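The conversion performed by the coordinate converter listed above amounts to adding or subtracting a per-part offset. The following is a minimal Python sketch of that idea, not the converter's actual implementation. It assumes each pseudomolecule is split into consecutive parts and that positions are 1-based; the names `PART_OFFSETS`, `part_to_whole` and `whole_to_part`, the chromosome and part names, and the offset values are all hypothetical placeholders rather than the real split points.

```python
from bisect import bisect_right

# Hypothetical layout: for each whole pseudomolecule, the parts it is split
# into and the number of bases that precede each part. These values are
# placeholders for illustration only, not real split points.
PART_OFFSETS = {
    "chrExample": [("chrExample_part1", 0), ("chrExample_part2", 1000)],
}


def part_to_whole(part, pos):
    """Convert a 1-based position on a part to (chromosome, 1-based position)."""
    for chrom, parts in PART_OFFSETS.items():
        for name, offset in parts:
            if name == part:
                return chrom, pos + offset
    raise KeyError(f"unknown part: {part}")


def whole_to_part(chrom, pos):
    """Convert a 1-based whole-chromosome position to (part, 1-based position)."""
    if pos < 1:
        raise ValueError("positions are 1-based")
    parts = PART_OFFSETS[chrom]
    starts = [offset for _, offset in parts]
    # Find the last part whose offset is strictly less than pos.
    i = bisect_right(starts, pos - 1) - 1
    name, offset = parts[i]
    return name, pos - offset
```

With the placeholder layout above, `part_to_whole("chrExample_part2", 1)` and `whole_to_part("chrExample", 1001)` are inverses of each other. Real use would require loading the actual part boundaries for the assembly in question.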
Help
DAWN source data sets
The following data sets are available through DAWN:
- Reference sequence
  - IWGSC RefSeq v1.0 genome assembly[1].
- Gene annotations
  - IWGSC v1.0
  - IWGSC v1.1
- Markers (as part of IWGSC v1.0 annotation)
  - DArT
  - EST
  - MAS
  - SNP platforms
  - SSR
- Whole genome shotgun data
- Transcriptomic data
- Exome data
Citing DAWN
- Watson-Haigh, N.S., Suchecki, R., Kalashyan, E., Garcia, M. & Baumann, U. DAWN: a resource for yielding insights into the diversity among wheat genomes. BMC Genomics 2018 19:941.
External resources
You may also find the following wheat-related resources useful:
Software - internal
| Access type | Resource | Description |
| --- | --- | --- |
| School of Agriculture, Food & Wine only | | A BLAST server containing the IWGSC RefSeq v1.0 assembly. |
| Internal network only | coching | |
| | bwpf | Bread Wheat Promoter Finder |
| | blast | A BLAST server containing a mix of published and unpublished data sets |
| | POTAGE | A POTAGE server containing published and unpublished data sets |
| | fetch | Simple sequence retrieval for selected genome assemblies |
| | dev-DAWN | Diversity Among Wheat geNomes |
| | blast-dev | |
| | GitLab | Project/code/software development resources |
| | bareos | Backup server |
| | Genome Ribbon | Web server for visualising structural variants generated from PacBio alignments. Modified to load decompressed CSI BAM index files. |
Scientific computing infrastructure
Research involving large, complex, polyploid genomes such as that of wheat often requires computers with large amounts of RAM.
However, such infrastructure is not normally available from typical high-performance computing (HPC) providers. As such, we operate our own scientific computing infrastructure, which includes several large memory nodes.
Current infrastructure
- A Slurm-based cluster consisting of 2 compute nodes with the following specs:
  - 72 CPUs (Intel Xeon E5-2699v3 @ 2.30GHz)
  - 755 GB RAM
  - 880 GB fast local storage (2 x SSDs in RAID 0)
  - Access to 120 TB of clustered storage
- A stand-alone compute node with the following specs:
  - 64 CPUs (Intel Xeon E7-4830 @ 2.13GHz)
  - 512 GB RAM
  - 1.7 TB fast local storage
  - Access to 120 TB of clustered storage
Access to resources made available through:
- NeCTAR Research Cloud (2 allocations: one for training and one for the group)
  - 430 CPUs
  - 1.7 TB RAM
  - 5 TB of object storage
- University of Adelaide’s phoenix HPC