Learning Apache Spark, including code and data. Most parts can run locally.
Spark MLlib Learning
Apache CarbonData Learning
SparkLearning_NoData, including code, pom, and so on
Adam Learning (bigdatagenomics)
Spark Library for Hadoop Upserts And Incrementals
Java Native Interface (JNI), including Java, Scala, C, and C++
Example code from Learning Spark book
The Smith-Waterman algorithm on Spark
Graphs for Everyone
Deep learning framework running on Spark
Notes talking about the design and implementation of Apache Spark
MapReduce, Spark, Java, and Scala for Data Algorithms Book
Spark ML: analysis of algorithm principles and of the concrete source code implementations
BigDL: Distributed Deep Learning Library for Apache Spark
Apache CarbonData source code reading
Coolplay Spark (酷玩 Spark): Spark source code analysis, Spark libraries, and more
GCDSS: Distributed Gene Clinical Decision Support System Based on Cloud Computing
winutils.exe, hadoop.dll, and hdfs.dll binaries for Hadoop on Windows
Optimized data access for AI based on CarbonData files
An open-source storage layer that brings scalable, ACID transactions to Apache Spark™ and big data workloads.
Repo for counting stars and contributing. Press F to pay respect to glorious developers.
2018/2019/campus recruiting/spring recruiting/autumn recruiting/algorithms/Machine Learning/Deep Learning/Natural Language Processing (NLP)/C/C++/Python/interview notes
LabelImg is a graphical image annotation tool for labeling object bounding boxes in images
Android USB tethering driver for Mac OS X
ESC-50: Dataset for Environmental Sound Classification
Deep Learning Book Chinese Translation
Tesseract Open Source OCR Engine (main repository)
Python implementation of the Parquet columnar file format.
Mirror of Apache HBase
Mirror of Apache Hive
Unofficial project based on the original /r/Deepfakes thread. Many thanks to him!
Caffe: a fast open framework for deep learning.
Publications: research results
Mirror of Apache CarbonData Site
thyroid knowledge learning
Open Hackathon Platform
Distributed object store
Bitcoin Core integration/staging tree
Example Play Scala application showing REST API
REST API using Play in Java
A delightful community-driven (with 1,100+ contributors) framework for managing your zsh configuration. Includes 200+ optional plugins (rails, git, OSX, hub, capistrano, brew, ant, php, python, etc.), over 140 themes to spice up your morning, and an auto-update tool that makes it easy to keep up with the latest updates from the community.
Open Source Computer Vision Library
Mirror of Apache CarbonData
Dubbo is a distributed, high performance RPC framework empowering applications with service import/export capabilities.
Docker build for Apache Spark
Documentation for Docker Official Images in docker-library
General-purpose web UI for Kubernetes clusters
Home-buying knowledge drawn from my experience buying a house in Hangzhou in 2017, shared in the hope that it will be useful to everyone. Buying a home is not easy, so cherish it when you do.
OKCoin REST API client examples; currently only C++, C#, Java, PHP, and Python are provided
Spark Summit 2017 San Francisco
Build Spark Batch/Streaming/MLlib applications with SQL
Java implementation of the Ethereum yellowpaper
Py4J enables Python programs to dynamically access arbitrary Java objects
TPC-DS benchmark kit with some modifications/fixes
Deep Learning Pipelines for Apache Spark
Production-Grade Container Scheduling and Management
RDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark
Microsoft Machine Learning for Apache Spark
ServiceComb Java Chassis is a Software Development Kit (SDK) for rapid development of microservices in Java, providing service registration, service discovery, dynamic routing, and service management features
Mastering Apache Spark 2
Mirror of Apache Spark
Cloud Native Landscape
Distributed SQL query engine for big data
ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.
CloudBWA: a distributed read mapping algorithm in GCDSS
Tools to process LIANTI sequence data
A single molecule sequence assembler for genomes large and small.
Parallel alignment using SNAP on ADAM. Apache 2 licensed.
ADAM is a genomics analysis platform with specialized file formats built using Apache Avro, Apache Spark and Parquet. Apache 2 licensed.
Deep Learning library for Python. Convnets, recurrent neural networks, and more. Runs on Theano or TensorFlow.
Java Bindings (JNI) for bwa
Spark-GATK is a genomics analysis framework based on Apache Spark and ADAM.
TensorFlowOnSpark brings TensorFlow programs onto Apache Spark clusters
SparkBWA is a new tool that exploits the capabilities of a Big Data technology such as Apache Spark to boost the performance of one of the most widely adopted sequence aligners, the Burrows-Wheeler Aligner (BWA).
Ansible is a radically simple IT automation platform that makes your applications and systems easier to deploy. Avoid writing scripts or custom code to deploy and update your applications; automate in a language that approaches plain English, using SSH, with no agents to install on remote systems.
An experimental distributed execution engine
An experimental distributed execution engine
A simple library class which helps with loading dynamic JNI libraries stored in the JAR archive
Big Data, MapReduce, Spark, PySpark, Java @ Santa Clara University, Fall 2016
A customisable 3D platform for agent-based AI research
Play 2.5, ScalaJS, Binding.scala starter project.
Library for SIMD accelerated optimal Sequence Alignments
SIMD C/C++ library for massive optimal sequence alignment
Smith-Waterman database searches with inter-sequence SIMD parallelisation
Formalized Bioinformatics and Reliable Software
Lightweight, super fast C/C++ library for sequence alignment using edit (Levenshtein) distance.
Java JNI interface for the parasail C library.
Pairwise Sequence Alignment Library
Basic Scala hello world with a JNI call (Linux/Emacs/Bash/GCC)
A high performance and compression ratio compressor, powered by GTXLab of Genetalks.
Distributed Smith-Waterman algorithm
Java Native Access
Gecloud: Gene Data Analysis System Based on Cloud Computing Technology
Mirror of Apache Hadoop
Databricks Scala Coding Style Guide
Spark Tutorial Collection
Cloudflow is a MapReduce and Spark pipeline framework to simplify the pipeline creation in biomedical research, especially in the field of Genetics.
Java quicksort algorithm demo
Mocking framework for unit tests written in Java
An awesome list of high-quality open datasets in public domains (on-going).
Bioinformatics for the Scala programming language
Breeze is a numerical processing library for Scala.
Java JNI with Maven
Scala Hello World with JNI and RegisterNatives
A flexible framework for rapid genome analysis and interpretation
GATK Official Release Repository: contains the core MIT-licensed GATK framework, plus "protected" tools restricted to non-commercial use only
Redis is an in-memory database that persists on disk. The data model is key-value, but many different kinds of values are supported: Strings, Lists, Sets, Sorted Sets, Hashes, HyperLogLogs, Bitmaps.
A fast and sensitive gapped read aligner
An independent, student-led replication of DeepMind's 2016 Nature publication, "Mastering the game of Go with deep neural networks and tree search" (Nature 529, 484-489, 28 Jan 2016), details of which can be found on their website https://deepmind.com/publications.html.
Fork from CloudBurst project to learn about sequence alignment in Hadoop
Read and variant metrics, useable for pipeline quality control purposes. Apache 2 licensed.
A scalable genome browser. Apache 2 licensed.
Beautiful math in all browsers
Integrating ADAM/Parquet with GATK
Mirror of Apache Kylin
Mirror of Apache Flink
Computation using data flow graphs for scalable machine learning
General (non-omics) code used across BDG products. Apache 2 licensed.
Notebook tools for Big Data Genomics. Apache 2 licensed.
A Variant Caller, Distributed. Apache 2 licensed.
Code to accompany Advanced Analytics with Spark from O'Reilly Media
Burrows-Wheeler Aligner for pairwise alignment between DNA sequences
Memory-Centric Virtual Distributed Storage System
GATK Official Release Repository: contains the core MIT-licensed GATK framework, free for all uses
Scalable Nucleotide Alignment Program -- a fast and accurate read aligner for high-throughput sequencing data
eQTL analysis in Apache Spark
Simple PHP Silex App using NeoClient for Neo4j