class: center, middle
name: opening

# [NeuroData](http://neurodata.io):
### Enabling Terascale Neuroscience for Everyone
.center[
presented by Joshua T. Vogelstein
{[bme](http://www.bme.jhu.edu/),[icm](http://icm.jhu.edu/),[cis](http://cis.jhu.edu/),[idies](http://idies.jhu.edu/),kavli,[cs](http://engineering.jhu.edu/computer-science/), [ams](http://engineering.jhu.edu/ams/), [neuro](http://neuroscience.jhu.edu/)}@[jhu](https://www.jhu.edu/)
please ask questions: [support@neurodata.io](mailto:support@neurodata.io)!
these slides:
]

---

# Big Scientific Data

.pull-left[
- molecular genomics (1D)
]

.pull-right[
]

---

# Big Scientific Data

.pull-left[
- molecular genomics (1D)
- cosmology (2D)
]

.pull-right[
]

---

# Big Scientific Data

.pull-left[
- molecular genomics (1D)
- cosmology (2D)
- Neuroscience! (3D+)
]

.pull-right[
]

---
class: center, middle

---

## Challenge #1

.center[
]

Goal: Enable generating and confirming anatomical hypotheses using simple tools, in a reproducible and extensible fashion.

---

## Challenge #2

.center[
Goal: What is the spatial distribution of synapses?
]

---
class: center

## [NeuroData](http://neurodata.io) 6 steps to discovery
---
name: applications
layout: false
class: middle, center

.center[
# Methods
]

---
layout: false
class: middle, center

---
name: store

### Store

- Goal: eliminate read/write bottlenecks
- Challenge: accessing a file takes time, data are big
- Action: .y[ndbackend] infrastructure

.center[
]

.cbar[ [@kunal](https://github.com/kunallillaney) ]

---
layout: false
class: left

### Store: Image Data Model

- Goal: general enough for many data types
- Challenge: simple enough to understand
- Action: .y[nddatamodel]
.center[
]

.cbar[ [@kunal](https://github.com/kunallillaney) ]

---

### Store: Write Volumes

- Goal: write faster than acquire
- Challenge: data stored as images, streamed quickly
- Action: .y[ndstore] spatial database
.center[
]

.cbar[ [@kunal](https://github.com/kunallillaney): [
](https://github.com/neurodata/ndstore) | [
](http://docs.neurodata.io/ndstore/sphinx/console.html) | [
](https://github.com/neurodata/ndstore/issues/new) ]

---

### Store: Read Volumes

- Goal: read volumes for machine vision
- Challenge: data stored as images
- Action: .y[ndstore] spatial database
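
A minimal sketch of how a spatial database might map a volume read onto stored blocks. It assumes the volume is chunked into fixed-size cuboids keyed by a Morton (z-order) index; the slides do not state the indexing scheme, so that choice, the `CUBOID` shape, and both function names are illustrative assumptions rather than the ndstore API.

```python
from itertools import product

CUBOID = (128, 128, 16)  # hypothetical (x, y, z) cuboid shape in voxels


def morton3(x, y, z, bits=21):
    """Interleave the bits of x, y, z into one Morton (z-order) key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (3 * i)
        key |= ((y >> i) & 1) << (3 * i + 1)
        key |= ((z >> i) & 1) << (3 * i + 2)
    return key


def cuboids_for_cutout(start, stop, cuboid=CUBOID):
    """Return sorted Morton keys of every cuboid overlapping [start, stop)."""
    ranges = [
        range(lo // c, (hi - 1) // c + 1)
        for lo, hi, c in zip(start, stop, cuboid)
    ]
    return sorted(morton3(x, y, z) for x, y, z in product(*ranges))


# A cutout that straddles two cuboids along x touches keys 0 and 1.
keys = cuboids_for_cutout((100, 0, 0), (200, 100, 10))
```

Sorting by a space-filling-curve key keeps spatially adjacent cuboids close together in the key space, which is one common way to turn a 3D cutout into a few mostly sequential reads.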
.center[
]

.cbar[ [@kunal](https://github.com/kunallillaney): [
](https://github.com/neurodata/ndstore) | [
](http://docs.neurodata.io/ndstore/sphinx/console.html) | [
](https://github.com/neurodata/ndstore/issues/new) ]

---

### Store: Read Tiles

- Goal: visualize @ video rates for mobile
- Challenge: data stored in cubes
- Action: .y[ndtilecache] content distribution network
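
A minimal sketch of the idea behind a tile cache: cut a 2D tile out of a 3D cube once, then serve repeats from memory. Everything here is hypothetical and is not the ndtilecache implementation: `load_cube` stands in for a slow database read, and `CUBE` and `TILE` are toy sizes chosen so that each tile lies inside a single cube.

```python
from functools import lru_cache

CUBE = (8, 8, 2)  # hypothetical cuboid shape (x, y, z) in voxels
TILE = 4          # hypothetical tile edge; divides the cube's x and y evenly


def load_cube(cx, cy, cz):
    """Stand-in for a database read: returns fake data indexed as [z][y][x]."""
    sx, sy, sz = CUBE
    return [
        [[(cx * sx + x + 2 * (cy * sy + y) + 3 * (cz * sz + z)) % 256
          for x in range(sx)] for y in range(sy)]
        for z in range(sz)
    ]


@lru_cache(maxsize=1024)
def get_tile(tx, ty, z):
    """Return the TILE x TILE image at tile column tx, tile row ty, plane z."""
    x0, y0 = tx * TILE, ty * TILE
    cube = load_cube(x0 // CUBE[0], y0 // CUBE[1], z // CUBE[2])
    plane = cube[z % CUBE[2]]            # pick the z-slice inside the cube
    lx, ly = x0 % CUBE[0], y0 % CUBE[1]  # tile offset inside that slice
    return tuple(tuple(row[lx:lx + TILE]) for row in plane[ly:ly + TILE])


first = get_tile(1, 0, 3)   # computed from the cube
second = get_tile(1, 0, 3)  # served from the in-memory cache
```

The second call never touches `load_cube`, which is the property a viewer needs when many clients pan over the same region.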
.center[
]

.cbar[ [@kunal](https://github.com/kunallillaney): [
](https://github.com/neurodata/ndtilecache) | [
](http://docs.neurodata.io/ndstore/api/tile_api.html) | [
](https://github.com/neurodata/ndtilecache/issues/new) ]

---
name: ramondb
layout: false

### Store: Shape Data Model

- Goal: general enough for many data types
- Challenge: simple enough to understand
- Action: .y[ndramon]

.center[
]

.cbar[ [@willgray](https://github.com/willgray): [
](https://github.com/neurodata/ndstore/tree/microns/ramon) | [
](http://docs.neurodata.io/nddocs/ndprocess/ramon.html) | [
](https://github.com/neurodata/ndio/issues/new) ]

---

### Store: Read Shapes

- Goal: enable easy semantic queries
- Challenge: no standard "structured vocabulary"
- Action: .y[ramondb] stores object metadata
.center[
]
.cbar[ [@willgray](https://github.com/willgray): [
](https://github.com/neurodata/ndstore/tree/microns/ramon) | [
](http://docs.neurodata.io/nddocs/ndprocess/ramon.html) | [
](https://github.com/neurodata/ndio/issues/new) ]

---

### Store: Write Shapes

- Goal: write annotations with object-level metadata fast
- Challenge: many small files
- Action: .y[ndblaze] assembles transparently in RAM

.center[
]

.cbar[ [@kunal](https://github.com/kunallillaney): [
](https://github.com/neurodata/ndblaze) | [
](https://github.com/neurodata/ndblaze/issues/new) ]

---
name: explore
layout: false

### Explore: Images

- Goal: Google Maps-like interface to explore images
- Challenge: mobile speed
- Action: .y[ndviz] navigation

.center[
]

.cbar[ [@alexbaden](https://github.com/alexbaden): [
](http://ix.neurodata.io) | [
](https://github.com/neurodata/NeuroDataViz) | [
](https://github.com/neurodata/NeuroDataViz/issues) ]

---

### Explore: Histograms

- Goal: explore histograms of very large volumes
- Challenge: cannot compute histograms in memory
- Action: .y[ndhist]

.center[
]

.cbar[ [@alexbaden](https://github.com/alexbaden): [
](http://hx.neurodata.io) | [
](https://github.com/neurodata/histogram-explorer) | [
](https://github.com/neurodata/histogram-explorer/issues) ]

---

### Explore: Shapes

- Goal: dynamically overlay annotations
- Challenge: interactivity
- Action: .y[ndviz] add channel + query
.center[
]

.cbar[ [@alexbaden](https://github.com/alexbaden): [
](http://ix.neurodata.io) | [
](https://github.com/neurodata/NeuroDataViz) | [
](https://github.com/neurodata/NeuroDataViz/issues) ]

---
layout: true

.bbar[ [Intro](#intro) | [Methods](#methods) | [Results](#applications) | [Discussion](#disc)]

---
name: 2d
layout: false

### Analyze: 2D Correction

- Goal: auto remove 2D histogram artifacts
- Challenge: signal & noise overlap, data are big
- Action: .y[distributed multi-grid] (dmg)
.center[
]
.cbar[ [@misha](https://github.com/mkazhdan/): [
](https://github.com/mkazhdan/DMG) | [
](http://www.cs.jhu.edu/~misha/MyPapers/ToG10.pdf) | [
](https://github.com/mkazhdan/DMG/issues/new) ]

---
name: 3d

### Analyze: 3D Correction

- Goal: auto remove 3D histogram artifacts
- Challenge: data are bigger
- Action: .y[gradient domain fusion] (gdf)
.center[
]
.cbar[ [@misha](https://github.com/mkazhdan/): [
](http://www.cs.jhu.edu/~misha/Code/GradientDomainFusion/) | [
](http://arxiv.org/pdf/1506.02079v1.pdf) | [
](https://github.com/mkazhdan/DMG/issues/new) ]

---
name: ndparse

### Analyze: Object Detection

- Goal: scalable object detection
- Challenge: requires labeling, training, & distributed computing
- Action: .y[ndparse]

.center[
]
.cbar[ [@willgray](https://github.com/willgray): [
](https://github.com/neurodata/ndparse) | [
](http://docs.neurodata.io/nddocs/ndparse/mbcd.html) | [
](https://github.com/neurodata/ndparse/issues/new) ]

---
layout: true

.bbar[ [Intro](#intro) | [Methods](#methods) | _[Results](#applications)_ | [Discussion](#disc)]

---
name: applications
layout: false
class: middle, center

.center[
# Results
]

---
name: r_im

## Image Datasets

.center[
]

- 10+ public datasets
- EM, AT, Ophys, XCT
- 100+ teravoxels
- All "reference" datasets

.cbar[ [
Projects](http://neurodata.io/projects) ]

---
name: r_anno

## Annotation Datasets

.center[
]

- 5 different publications with volumetric annotations
- No skeleton annotations yet
- Volumetric annotations are required for training machine vision
- Annotations span spatiotemporal scales: nano, micro, time-series

---
name: r_cs1

## Case Study #1
#### Reproducible and Extensible Big Data Neuroscience

Statistical claims

1. [axons & dendrites](https://github.com/neurodata/kasthuri2015/blob/master/claims/claim02_axons_and_dendrites.ipynb)
1. [synapses](https://github.com/neurodata/kasthuri2015/blob/master/claims/claim3_synapses.ipynb)
1. [mitochondria](https://github.com/neurodata/kasthuri2015/blob/master/claims/claim4_mitochondria.ipynb)
1. [spines](https://github.com/neurodata/kasthuri2015/blob/master/claims/claim5_spines.ipynb)
1. [vesicles](https://github.com/neurodata/kasthuri2015/blob/master/claims/claim6_vesicles.ipynb)
1. [connectivity](https://github.com/neurodata/kasthuri2015/blob/master/claims/claim9_make_graph.ipynb)

---

## Case Study #1
#### Reproducible and Extensible Big Data Neuroscience

[axons and dendrites](https://github.com/neurodata/kasthuri2015/blob/master/claims/claim2_axons_and_dendrites.ipynb)

1. How many different neurons?
1. How many dendrites?
1. What fraction are spiny?
1. How many unmyelinated axons?
1. What fraction are excitatory?
1. How many spines?
1. How many axon branches?
1. What fraction of cellular volume is neuron?
1. What fraction of volume is cells?
1. Relative volume of axon vs. dendrite?

...

.cbar[ [
](http://docs.neurodata.io/nddocs/ndpaper2016/case_study1.html) Data: Kasthuri et al. (Cell) 2015 ]

---
name: r_cs2

## Case Study #2
#### Is synapse distribution uniform in space?
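
One simple way to pose the uniformity question, sketched under assumptions: bin the detected synapse centroids on a regular 3D grid and compare the counts with the equal counts a uniform density would give, using a chi-square statistic. The function name, the grid binning, and the toy points are illustrative only; the slides do not say which test the case study used.

```python
from collections import Counter


def chi_square_uniform(points, bounds, bins):
    """Chi-square statistic for 3D points against a uniform density.

    points: iterable of (x, y, z); bounds: (X, Y, Z) extent of the volume,
    with every coordinate in [0, extent); bins: number of bins per axis.
    Returns (statistic, degrees_of_freedom).
    """
    counts = Counter(
        tuple(int(c * bins / extent) for c, extent in zip(p, bounds))
        for p in points
    )
    n = sum(counts.values())
    n_cells = bins ** 3
    expected = n / n_cells
    stat = sum((obs - expected) ** 2 / expected for obs in counts.values())
    # Empty cells are absent from the Counter; each contributes `expected`.
    stat += (n_cells - len(counts)) * expected
    return stat, n_cells - 1


# Toy check: eight points piled into one corner of a 2 x 2 x 2 grid.
clustered = [(0.1, 0.1, 0.1)] * 8
stat, dof = chi_square_uniform(clustered, bounds=(2, 2, 2), bins=2)
```

A statistic that is large relative to the chi-square distribution with `dof` degrees of freedom argues against spatial uniformity at the chosen bin size.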
.center[
]

Detecting ~11.6 million synapses from the [bock11](http://openconnecto.me/bock11) dataset

???

timings

---
name: r_anal
class: center, middle
.cbar[ [
](http://docs.neurodata.io/nddocs/ndpaper2016/case_study2.html) Data: Bock et al. (Nature) 2011 ]

---
layout: true

.bbar[ [Intro](#intro) | [Methods](#methods) | [Results](#applications) | _[Discussion](#disc)_]

---
name: disc
class: middle, center
layout: false

# Discussion

---

### Open Science Contributions

- Reference EM and other datasets
- Reference annotations (crucial for training machine learning)
- Reference pipelines for operating on such data
- Web-services for data access
- Comprehensive cloud computing platform

--

### Implications: it is now easier to

- Answer questions that require scale
- Engage complementary expertise
- Reproduce and extend results

--

### Coming soon

- [ndio](http://docs.neurodata.io/nddocs/ndio/) python API
- More public datasets: CLARITY, ExM, XCM, MRI, Ophys, etc.
- More Web-services: processing each data modality

---

### Related Work

- [CATMAID](http://catmaid.readthedocs.org/en/stable/)
- [Neurodata Without Borders](http://nwb.org/)
- [Keller Lab Block File Format](http://www.nature.com/nprot/journal/v10/n11/abs/nprot.2015.111.html)

--

### Further Lowering the Barrier to Entry

- Put it all together
- Apply to other domains & questions
- We work together!

--

### References

1. Burns R et al. [The Open Connectome Project Data Cluster: Scalable Analysis and Vision for High-Throughput Neuroscience](http://arxiv.org/abs/1306.3543). Scientific and Statistical Database Management (SSDBM), 2013.
1. Burns R, Vogelstein JT, Szalay AS. [From Cosmos to Connectomes: the Evolution of Data-Intensive Science](http://www.sciencedirect.com/science/article/pii/S0896627314007466). Neuron, 2014.

---
name: fam
layout: false
class: center

# NeuroData Family

.center[

| | | |
| :--- | :--- | :--- |
| .r[Collect] | | .r[Clay Reid, Davi Bock, Jeff Lichtman, Bobby Kasthuri, Karl Deisseroth, Raju Tomer, Li Ye, Ailey Crow, Ed Boyden, Mike Milham, Cameron Craddock, Stephen Smith, Forrest Collman, Kristen Harris, Scott Emmons, Dan Bumbarger, Mitya Chklovskii, Nikhil Bhatla, Nelson Spruston, Erik Bloss]
| .orange[Store] | | .orange[Randal Burns, Eric Perlman, Kunal Lillaney, Priya Manavalan, Alex Eusman]
| .ye[Explore] | | .ye[Alex Baden, Ivan Kuznetsov, David Marchette, Leo Duan, Albert Lee]
| .g[Analyze] | | .g[Mike Miller, Nicholas Charon, Misha Kazhdan, Jordan Matelsky, Kwame Kutten, Greg Kiar, Eric Bridgeford, Greg Hager, Will Gray Roncal, Mark Chevillet, Dean Kleissas, R. Jacob Vogelstein, Guillermo Sapiro, Anish Simhal, Konrad Kording, Eva Dyer]
| .blue[Model] | | .blue[Joshua T. Vogelstein, Carey Priebe, Dan Naiman, Tyler Tomita, Youngser Park, Jesse L. Patsolic, Leo Duan, Cencheng Shen, Ivan Kuznetsov]
| .purple[Graphs] | | .purple[Da Zheng, Disa Mhembere, Vince Lyzinski, Avanti Athreya, Daniel Sussman, Shangsi Wang, Runze Tang, Minh Tang]
| .pink[Love] | | .pink[Yummy, Family, Friends, Earth, Universe, Multiverse?]

]

---
class: middle, center

# Questions?

_____

Funding

NIH: {CRCNS, BRAINI, TRA}
NSF: BIGDATA
DARPA: {XDATA,GRAPHS,SIMPLEX}
IARPA: MICrONS

____

w: [neurodata.io](http://neurodata.io)
d: [docs.neurodata.io](http://docs.neurodata.io)
e: [support@neurodata.io](mailto:support@neurodata.io)