SparkLearning Scala 344

Learning Apache Spark, including code and data. Most parts can run locally.

MLlibLearning GCC Machine Description 40

Spark MLlib Learning

CarbonDataLearning Scala 29

Apache CarbonData Learning

SparkLearning_NoData Scala 8

SparkLearning_NoData, including code, pom, and so on

GeneDataProcess Scala 4

Adam Learning (bigdatagenomics)

Homepage JavaScript 2


Spark2Learning Scala 2

Spark 2.x Learning

hudi * Java 1

Spark Library for Hadoop Upserts And Incrementals

JNILearning C++ 1

Java Native Interface, including Java, Scala, C, and C++

learning-spark * Java 1

Example code from Learning Spark book

SparkSW * 1

The Smith-Waterman algorithm on Spark

neo4j * Java 1

Graphs for Everyone

neo4j_java 1


deepspark * Java 1

Deep learning framework running on Spark

SparkInternals * 1

Notes talking about the design and implementation of Apache Spark

data-algorithms-book * Java 1

MapReduce, Spark, Java, and Scala for Data Algorithms Book

spark-ml-source-analysis * 1

Analysis of the principles behind Spark ML algorithms and of their concrete source-code implementations

BigDL * Scala 1

BigDL: Distributed Deep Learning Library for Apache Spark

spark-terasort * Java 1

Spark Terasort

carbondata_guide * 1

Apache CarbonData source code reading notes

CoolplaySpark * Scala 1

Coolplay Spark: Spark source code analysis, Spark libraries, and more

GCDSS Scala 1

GCDSS: Distributed Gene Clinical Decision Support System Based on Cloud Computing

liuyangzhang HTML 1

Liuyang Zhang color roof tiles site, with company introduction, photo gallery, news coverage, registration and booking, etc. * HTML 1

Futures-Java-demo * Java 0

API_Docs * 0

Huobi API documentation

thyroid 0

thyroid knowledge learning

open-hackathon * Python 0

Open Hackathon Platform

kcoin * Ruby 0

ambry * Java 0

Distributed object store

TeXmacs.scala * Scala 0

TODO * Groovy 0

todo list

bitcoin * C++ 0

Bitcoin Core integration/staging tree

play-scala-rest-api-example * Scala 0

Example Play Scala application showing REST API

play-java-rest-api-example * Java 0

REST API using Play in Java

oh-my-zsh * Shell 0

A delightful community-driven (with 1,100+ contributors) framework for managing your zsh configuration. Includes 200+ optional plugins (rails, git, OSX, hub, capistrano, brew, ant, php, python, etc), over 140 themes to spice up your morning, and an auto-update tool that makes it easy to keep up with the latest updates from the community.

opencv-java-face-recognition * Java 0

opencv * C++ 0

Open Source Computer Vision Library

carbondata * Scala 0

Mirror of Apache CarbonData

tpch-dbgen * C 0

TPC-H dbgen

dubbo * Java 0

Dubbo is a distributed, high-performance RPC framework empowering applications with service import/export capabilities.

docker-spark * 0

Docker build for Apache Spark

KeepLearning 0


docs * Shell 0

Documentation for Docker Official Images in docker-library

dashboard * JavaScript 0

General-purpose web UI for Kubernetes clusters

hangzhou_house_knowledge * CSS 0

Home-buying knowledge distilled from my experience buying a house in Hangzhou in 2017, shared in the hope that it is useful to everyone. Buying a home is not easy, so cherish it when you do.

OkcoinHuobiTickerSpider * Python 0

liveStrategyEngine * Python 0


rest * C 0

OKCoin REST API client examples; currently only C++, C#, Java, PHP, and Python are provided

spark-summit-2017-SanFrancisco * 0

Spark Summit 2017 San Francisco

streamingpro * JavaScript 0

Build Spark Batch/Streaming/MLlib Application by SQL

ethereumj * Java 0

Java implementation of the Ethereum yellowpaper

py4j * Java 0

Py4J enables Python programs to dynamically access arbitrary Java objects

tpcds-kit * C 0

TPC-DS benchmark kit with some modifications/fixes

spark-deep-learning * Python 0

Deep Learning Pipelines for Apache Spark

Publications 0

Publications: research result

tensorflow-2 * Python 0


kubernetes * Go 0

Production-Grade Container Scheduling and Management

SparkRDMA * Scala 0

RDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark

mmlspark * Scala 0

Microsoft Machine Learning for Apache Spark

tensorflow-1 * 0

Illustrated TensorFlow source code

ServiceComb-Java-Chassis * Java 0

ServiceComb Java Chassis is a Software Development Kit (SDK) for rapid development of microservices in Java, providing service registration, service discovery, dynamic routing, and service management features

mastering-apache-spark-book * 0

Mastering Apache Spark 2

spark * Scala 0

Mirror of Apache Spark

landscape * 0

Cloud Native Landscape

presto * Java 0

Distributed SQL query engine for big data

antlr4 * Java 0

ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.

CloudBWA Shell 0

CloudBWA: a distributed read mapping algorithm in GCDSS

lianti * C 0

Tools to process LIANTI sequence data

FinalPaper 0

canu * C++ 0

A single molecule sequence assembler for genomes large and small.

snappea * Scala 0

Parallel alignment using SNAP on ADAM. Apache 2 licensed.

adam * Scala 0

ADAM is a genomics analysis platform with specialized file formats built using Apache Avro, Apache Spark and Parquet. Apache 2 licensed.

keras * Python 0

Deep Learning library for Python. Convnets, recurrent neural networks, and more. Runs on Theano or TensorFlow.

jbwa * Java 0

Java Bindings (JNI) for bwa

gatk-1 * Java 0

GATK4 development

Spark-GATK * Scala 0

Spark-GATK is a genomics analysis framework based on Apache Spark and ADAM.

mame * C++ 0


cloud-scale-bwamem * Scala 0

TensorFlowOnSpark * Python 0

TensorFlowOnSpark brings TensorFlow programs onto Apache Spark clusters

wgsim * C 0

Reads simulator

SparkBWA * C 0

SparkBWA is a new tool that exploits the capabilities of a Big Data technology such as Apache Spark to boost the performance of one of the most widely adopted sequence aligners, the Burrows-Wheeler Aligner (BWA).

DSA Java 0


ansible * Python 0

Ansible is a radically simple IT automation platform that makes your applications and systems easier to deploy. Avoid writing scripts or custom code to deploy and update your applications— automate in a language that approaches plain English, using SSH, with no agents to install on remote systems.

CloudSW Java 0


ray * C 0

An experimental distributed execution engine

ray-legacy * Python 0

An experimental distributed execution engine

native-utils * Java 0

A simple library class which helps with loading dynamic JNI libraries stored in the JAR archive

big-data-mapreduce-course * Shell 0

Big Data, MapReduce, Spark, PySpark, Java @ Santa Clara University, Fall 2016

lab * C 0

A customisable 3D platform for agent-based AI research

Full-Stack-Scala-Starter * Scala 0

Play 2.5, ScalaJS, Binding.scala starter project.

libssa * C 0

Library for SIMD accelerated optimal Sequence Alignments

opal * C++ 0

SIMD C/C++ library for massive optimal sequence alignment

swipe * C++ 0

Smith-Waterman database searches with inter-sequence SIMD parallelisation

SHD_code * C 0

forbars * Coq 0

Formalized Bioinformatics and Reliable Software

edlib * C++ 0

Lightweight, super fast C/C++ library for sequence alignment using edit (Levenshtein) distance.

parasail-java * C 0

Java JNI interface for the parasail C library.

parasail * C 0

Pairwise Sequence Alignment Library

HelloWorldJNI * Scala 0

basic Scala hello world with JNI call (linux/emacs/bash/gcc)

gtz * Python 0

A compressor with high performance and a high compression ratio, powered by GTXLab of Genetalks.

Complete-Striped-Smith-Waterman-Library * C 0


distributed Smith-Waterman Algorithm

jna * Java 0

Java Native Access

Gecloud 0

Gecloud: Gene Data Analysis System Based on Cloud Computing Technology

hadoop * Java 0

Mirror of Apache Hadoop

scala-style-guide * 0

Databricks Scala Coding Style Guide * HTML 0

Spark Tutorial Collection

cloudflow * Java 0

Cloudflow is a MapReduce and Spark pipeline framework to simplify the pipeline creation in biomedical research, especially in the field of Genetics. CSS 0

NCRELearning Java 0

FindJob Java 0

kechengliu 0

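Kechengliu ("course stream") is a course communication platform for students and teachers. Teachers can publish courses on it, including course details, teacher details, teaching assistant details, assignments, notices, and so on. Students can join the courses they are interested in, or set up home pages for the courses they have chosen, and then post information, discuss questions, and recommend study materials within a course. Kechengliu aims to give students and teachers a good platform for learning and exchange, and to give users who are not computer science majors, or who are tired of editing code to create a course home page, a quick and convenient way to build and maintain one. It addresses today's scattered information and inconvenient communication, saves users time, and improves efficiency. Gathering course home pages in one place also lets users selectively study courses they have not taken, which supports self-directed learning and discussion.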

Java_DEMO * Java 0


QuickSort * Java 0

Java quicksort algorithm demo

mockito * Java 0

Mocking framework for unit tests written in Java

awesome-public-datasets * 0

An awesome list of high-quality open datasets in public domains (on-going).

bioscala * Scala 0

Bioinformatics for the Scala programming language

breeze * Scala 0

Breeze is a numerical processing library for Scala.

spark_meetup * 0

jni-maven * Java 0

Java JNI with Maven

HelloWorldJNIwithRegisterNatives * Scala 0

Scala Hello World with JNI and RegisterNatives

bwa-spark-0.2.3 * Scala 0

speedseq * C 0

A flexible framework for rapid genome analysis and interpretation

gatk-protected * Java 0

GATK Official Release Repository: contains the core MIT-licensed GATK framework, plus "protected" tools restricted to non-commercial use only

redis * C 0

Redis is an in-memory database that persists on disk. The data model is key-value, but many different kinds of values are supported: Strings, Lists, Sets, Sorted Sets, Hashes, HyperLogLogs, Bitmaps.

bowtie2 * C++ 0

A fast and sensitive gapped read aligner

RocAlphaGo * Python 0

An independent, student-led replication of DeepMind's 2016 Nature publication, "Mastering the game of Go with deep neural networks and tree search" (Nature 529, 484-489, 28 Jan 2016), details of which can be found on their website

Markdown * 0

Basic Markdown syntax.

CloudBurst * Java 0

Fork from CloudBurst project to learn about sequence alignment in Hadoop

qc-metrics * Scala 0

Read and variant metrics, useable for pipeline quality control purposes. Apache 2 licensed.

mango * Scala 0

A scalable genome browser. Apache 2 licensed.

MathJax * JavaScript 0

Beautiful math in all browsers

GAParquet * Scala 0

Integrating ADAM/Parquet with GATK

kylin * Java 0

Mirror of Apache Kylin

flink * Java 0

Mirror of Apache Flink

tensorflow * C++ 0

Computation using data flow graphs for scalable machine learning

utils * Scala 0

General (non-omics) code used across BDG products. Apache 2 licensed.

mango-notebook * JavaScript 0

Notebook tools for Big Data Genomics. Apache 2 licensed.

avocado * Scala 0

A Variant Caller, Distributed. Apache 2 licensed.

aas * Scala 0

Code to accompany Advanced Analytics with Spark from O'Reilly Media

JavaTraining Java 0

ScalaForImpatient Scala 0

bwa * C 0

Burrows-Wheeler Aligner for pairwise alignment between DNA sequences

alluxio * Java 0

Memory-Centric Virtual Distributed Storage System

WeChat * C# 0


wechat-deleted-friends * Python 0


gatk * Java 0

GATK Official Release Repository: contains the core MIT-licensed GATK framework, free for all uses

snap * C++ 0

Scalable Nucleotide Alignment Program -- a fast and accurate read aligner for high-throughput sequencing data

spark_eqtl * Python 0

eQTL analysis in Apache Spark

neo4j_php_examples 0


neo4j-neoclient-example * HTML 0

Simple PHP Silex App using NeoClient for Neo4j

