
Why Bring Galaxy to Iowa?

We started our pilot of Galaxy as a means of delivering open access and automation to a genetic analysis for deafness as part of a grant between MORL and CBCB. Through this effort we found value in the usability and accessibility features that Galaxy provides. Bringing Galaxy to Iowa allows us to provide the following benefits:

  • No user data quota limits
  • Local and closed storage and transfer of genetic information
  • The ability to customize all things Galaxy (and extend to provide custom features that make sense for Iowa)
  • Control over tool versions and which tools are exposed

Will there be data storage caps similar to the public Galaxy Server?

Nope. At this time there are no quotas configured for our local Galaxy deployment. We will be configuring automatic cleanup scripts which will look for data that has not been accessed in over 30 days. No data will be automatically removed without letting you know. All users will share a common storage space for Galaxy which will initially be 36 TB in size.

When you delete a dataset in Galaxy, it is not permanently deleted until we run our purge and cleanup scripts, which look at the date of last access (the last time the data was read). You can recover any dataset deleted in Galaxy until it has been officially purged.
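To make the 30-day rule concrete, here is a minimal sketch of how a check on last access time could look. It is an illustration only, not our actual cleanup script: the function name find_stale_files and the directory you pass in are hypothetical, and it only lists candidates without deleting anything.

```python
import time
from pathlib import Path

STALE_AFTER_DAYS = 30  # threshold described in the text above


def find_stale_files(root, now=None, max_age_days=STALE_AFTER_DAYS):
    """Return files under `root` whose last access is older than the cutoff.

    Hypothetical helper: it reports candidates only and never removes data.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 24 * 60 * 60
    stale = []
    for path in Path(root).rglob("*"):
        # st_atime is the last access (read) time recorded by the filesystem
        if path.is_file() and path.stat().st_atime < cutoff:
            stale.append(path)
    return sorted(stale)
```

Note that some filesystems are mounted so that access times are updated lazily or not at all, so a real script would need to take that into account.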

Can you help us out? By deleting datasets through Galaxy you help mark data that is safe for us to purge and remove after some length of time. This helps us know we are not removing something critical. Thanks.

Can I access Galaxy from anywhere?

Only if you use UIAnywhere tunneling. Right now Galaxy is only open on the UofI campus network. If you want to access Galaxy from off campus, you still can, but you need to use VPN tunneling. To set up and use Campus ITS's UIAnywhere VPN tunneling, please see the following information:

How many jobs can I run?

For our alpha deployment, we have capped Galaxy at 25 concurrent jobs shared across all Galaxy users. However, when we move to our production box there will be no limit on how many concurrent jobs Galaxy can run. Our production server will dispatch to two queues on Helium: our IIHG investor queue and Helium's ALL.Q. Jobs dispatched to ALL.Q may be moved (killed and restarted) if an investor needs to run a job on their compute node. When Sun Grid Engine (SGE) has to relocate jobs, it will try to do so intelligently by choosing to stop and restart the jobs that have been running for the least amount of time.
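The relocation policy described above can be pictured with a small sketch. This is only an illustration of the idea (restart the youngest ALL.Q jobs first), not how SGE is implemented or configured; the RunningJob class, the pick_jobs_to_restart function, and the queue label "IIHG" are all made up for the example.

```python
from dataclasses import dataclass


@dataclass
class RunningJob:
    job_id: str
    queue: str            # e.g. "ALL.Q" or a hypothetical investor queue label
    runtime_seconds: int  # how long the job has been running so far


def pick_jobs_to_restart(jobs, slots_needed, preemptible_queue="ALL.Q"):
    """Choose which jobs to stop and restart when an investor needs nodes.

    Only jobs in the preemptible queue are candidates, and the jobs that
    have run for the shortest time are chosen first, so the least work is lost.
    """
    candidates = [job for job in jobs if job.queue == preemptible_queue]
    candidates.sort(key=lambda job: job.runtime_seconds)
    return candidates[:slots_needed]
```

In this sketch, a job in the investor queue is never selected, no matter how recently it started.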

How do I create a local Galaxy account?

Good news: no paperwork required! Your Galaxy account and all necessary server accounts are automatically created for you when you first log on to the Galaxy user interface. Automatic server account creation is just one example of the customizations we are making to the Galaxy code base.

Must access from the campus network

Galaxy is only available from the campus network. Need to access it from anywhere? Use UIAnywhere, U of I ITS's VPN tunneling solution.

Use Galaxy:

Can I FTP data to Galaxy?

You can absolutely SFTP (SSH File Transfer Protocol) data into our Galaxy server. We do not expose FTP for security reasons (FTP sends your password over the network in plain text - yikes!). Once you have logged into Galaxy for the first time, all necessary accounts are automatically created for you. This allows you to SFTP data onto the Galaxy server for import into Galaxy using the Get Data -> Upload File tool in Galaxy.

Handy Tip

You can upload compressed data (zip, gzip, bz, bz2) and Galaxy will automatically decompress the data for you during the file upload into the Galaxy storage.
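If you want to compress a file before uploading it, the following is a minimal sketch using Python's standard gzip module. The function name gzip_file is our own invention for this example, and any other tool that produces one of the formats listed above works just as well.

```python
import gzip
import shutil
from pathlib import Path


def gzip_file(source):
    """Write a gzip-compressed copy of `source` next to it and return its path.

    The original file is left untouched; the copy gets a ".gz" suffix.
    """
    source = Path(source)
    target = source.with_name(source.name + ".gz")
    # Stream the data in chunks so large files do not need to fit in memory
    with open(source, "rb") as src, gzip.open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return target
```

For example, gzip_file("reads.fastq") would create reads.fastq.gz in the same directory, ready to transfer.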

How to get connected over SFTP

Must transfer data from the campus network

SFTP is only available from the campus network. Need to access it from anywhere? Use UIAnywhere, U of I ITS's VPN tunneling solution.

  • Server or Host to sftp data to:
  • Port to use: 22021
  • user/password: hawkid & passwd

Once connected you must change into the subdirectory with your hawkid to put data (this is the only directory with write permissions).
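As a rough guide, the connection details above translate into an sftp command and a short session like the one this sketch prints. The host name galaxy.example.edu and the HawkID myhawkid are placeholders, not real values, and the helper names are hypothetical; -P is the standard OpenSSH sftp option for choosing the port.

```python
import shlex

SFTP_PORT = 22021  # port given in the list above


def sftp_command(hawkid, host, port=SFTP_PORT):
    """Build the command line that opens an interactive sftp session."""
    return ["sftp", "-P", str(port), f"{hawkid}@{host}"]


def session_commands(hawkid, local_file):
    """Commands to type at the sftp> prompt once connected.

    Paths containing spaces would need to be quoted by hand.
    """
    return [
        f"cd {hawkid}",       # the only directory with write permissions
        f"put {local_file}",  # upload the file into that directory
        "bye",                # close the session
    ]


if __name__ == "__main__":
    # Placeholder values: substitute your own HawkID and the real server name
    print(shlex.join(sftp_command("myhawkid", "galaxy.example.edu")))
    for line in session_commands("myhawkid", "reads.fastq.gz"):
        print(line)
```

The sketch only prints the commands; you still type your password at the prompt when sftp asks for it.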

Alpha Deployment SFTP Consideration

On this alpha deployment, we have limited SFTP space (about 300 GB) that must be shared between all alpha users. If you upload data over SFTP, please import it immediately into Galaxy to free up space for other alpha users. If you hit disk storage errors, it will likely be due to our SFTP space being full. Please wait a couple of hours and try again.

This problem will be eliminated when we move to our production server.

Example of how to do this via the command line prompt:

Example setting up a connection from Fetch:

Once you have put data into your SFTP directory, it will appear in Galaxy's Get Data -> Upload File wizard:

Once data is successfully imported into Galaxy, it is automatically removed from your sftp upload directory.

I see a "The site's security certificate is not trusted!" warning when I access the alpha Galaxy deployment server ... what's that all about?

We mentioned our Galaxy deployment was in alpha phase, right? This is a good example of why it is considered an alpha deployment. If you see this message it is safe to click on "Proceed". We have temporarily generated and self-signed the security certificates for the test server that Galaxy is hosted on. Once we have our production server running, we will have an official University of Iowa signed certificate and you will no longer see this warning.

How come I don't see the same tools exposed locally as on the Public Galaxy Server?

We have decided to only expose tools in Galaxy as requested or needed. This helps us stage how we test and support our local Galaxy deployment. Our local Galaxy is capable of everything the public Galaxy server is. Additionally, Galaxy allows almost any tool to be plugged in, exposed in the user interface, and run. If you have a feature request for Galaxy, please post it to our galaxy-users listserv. If you see a feature request posted to the galaxy-users list and would also find it useful, please respond with your support. Feature requests will be prioritized based on community feedback and support.

Is my data backed up?

No, sorry. Galaxy leverages high speed data storage for its computations. This space is not backed up. We highly recommend downloading any critical data from Galaxy into permanent, archived storage.

Where does the data that is analyzed in Galaxy reside?

All data that is imported into Galaxy resides on the University of Iowa High Performance Computing Cluster, Helium. The data is accessible on Helium's high speed storage network, Lustre. The data physically resides in HPC's Helium server room in the University of Iowa's Lindquist Center.

I've hit a problem, now what?

Sorry to hear this. Please see our Get Help section. Thanks!
