Bo Xu

Shenzhen, China

SparkLearning Scala 570

Learning Apache Spark, including code and data. Most parts can run locally.

MLlibLearning GCC Machine Description 68

Spark MLlib Learning

CarbonDataLearning Scala 51

Apache CarbonData Learning

SparkLearning_NoData Scala 11

SparkLearning_NoData, including code, pom, and so on

GeneDataProcess Scala 5

Adam Learning (bigdatagenomics)

kechengliu 4

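Kechengliu (Course Stream) is a course communication platform for students and teachers. Teachers can publish courses on it, including course information, teacher information, teaching assistant information, assignments, and notices. Students can join courses that interest them, or create course homepages for courses of their own choosing, and post information, discuss questions, and recommend study materials within a course. Course Stream aims to give students and teachers a good platform for learning and communication, and to give users who are not computer science majors, or who find it tedious to edit code to build a course homepage, a fast and convenient way to create and maintain one. It addresses the current problems of scattered information and inconvenient communication, saves users time, and improves efficiency. Gathering course homepages in one place also lets users selectively study courses they have not taken, which supports self-study and discussion.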

Spark2Learning Scala 3

spark-2.X Learning

JNILearning C++ 2

Java Native Interface, including Java, Scala, C, and C++

Homepage JavaScript 2


CloudBWA Shell 1

CloudBWA: a distributed read mapping algorithm in GCDSS

Algorithm_Interview_Notes-Chinese * Python 1

2018/2019 / campus recruitment / spring recruitment / autumn recruitment / algorithms / Machine Learning / Deep Learning / Natural Language Processing (NLP) / C/C++ / Python / interview notes

liuyangzhang HTML 1


hudi * Java 1

Spark Library for Hadoop Upserts And Incrementals

learning-spark * Java 1

Example code from Learning Spark book

SparkSW * 1

Spark the Smith-Waterman algorithm

neo4j * Java 1

Graphs for Everyone

neo4j_java 1


deepspark * Java 1

Deeplearning framework running on Spark

SparkInternals * 1

Notes talking about the design and implementation of Apache Spark

data-algorithms-book * Java 1

MapReduce, Spark, Java, and Scala for Data Algorithms Book

spark-ml-source-analysis * 1

Analysis of Spark ML algorithm principles and their concrete source-code implementations

BigDL * Scala 1

BigDL: Distributed Deep Learning Library for Apache Spark

spark-terasort * Java 1

Spark Terasort

carbondata_guide * 1

Apache CarbonData source code reading

CoolplaySpark * Scala 1

Coolplay Spark: Spark source code analysis, Spark libraries, and more

GCDSS Scala 1

GCDSS: Distributed Gene Clinical Decision Support System Based on Cloud Computing

* HTML 1

mmlspark * Scala 0

Microsoft Machine Learning for Apache Spark

datumaro * 0

Dataset Management Framework, a Python library and a CLI tool to build, analyze and manage Computer Vision datasets.

springboot-learning-example * 0

Spring Boot hands-on learning examples: best practices for Spring Boot beginners and for consolidating core techniques. Also, for blogging, use OpenWrite.

AILearning Python 0


incubator-livy * 0

Mirror of Apache livy (Incubating)

ansible * Python 0

Ansible is a radically simple IT automation platform that makes your applications and systems easier to deploy. Avoid writing scripts or custom code to deploy and update your applications— automate in a language that approaches plain English, using SSH, with no agents to install on remote systems.

amazon-sagemaker-examples * 0

Example notebooks that show how to apply machine learning, deep learning and reinforcement learning in Amazon SageMaker

redis * C 0

Redis is an in-memory database that persists on disk. The data model is key-value, but many different kind of values are supported: Strings, Lists, Sets, Sorted Sets, Hashes, HyperLogLogs, Bitmaps.

VideoTools Java 0

cocoapi * 0

COCO API - Dataset @

API_Docs * 0

Huobi API documentation

Futures-Java-demo * Java 0

mastering-apache-spark-book * 0

Mastering Apache Spark 2

VoTT * 0

Visual Object Tagging Tool: An electron app for building end to end Object Detection Models from Images and Videos.

tensorflow-2 * Python 0


oh-my-zsh * Shell 0

A delightful community-driven (with 1,100+ contributors) framework for managing your zsh configuration. Includes 200+ optional plugins (rails, git, OSX, hub, capistrano, brew, ant, php, python, etc), over 140 themes to spice up your morning, and an auto-update tool that makes it easy to keep up with the latest updates from the community.

DSA Java 0


deep-residual-networks * 0

Deep Residual Learning for Image Recognition

Spring-Cloud_Samples * 0

based on Spring-Cloud `Finchley` version.

PaddleSeg * 0

PaddleSeg is a high performance semantic segmentation toolkit based on PaddlePaddle. (PaddlePaddle image segmentation library)

winutils * Shell 0

winutils.exe hadoop.dll and hdfs.dll binaries for hadoop windows

ModelArts-Lab * Jupyter Notebook 0


delta * Scala 0

An open-source storage layer that brings scalable, ACID transactions to Apache Spark™ and big data workloads.

996.ICU * Rust 0

Repo for counting stars and contributing. Press F to pay respect to glorious developers.

modelarts-dataset-sdk * Python 0

labelImg * Python 0

LabelImg is a graphical image annotation tool and label object bounding boxes in images

shenzhen_home * 0


Deep-Learning-21-Examples * Python 0


HoRNDIS * C++ 0

Android USB tethering driver for Mac OS X

ESC-50 * Python 0

ESC-50: Dataset for Environmental Sound Classification

deeplearningbook-chinese * TeX 0

Deep Learning Book Chinese Translation

tesseract * C++ 0

Tesseract Open Source OCR Engine (main repository)

parquet-python * Python 0

python implementation of the parquet columnar file format.

parquet-mr * Java 0

Apache Parquet

arrow * C++ 0

Apache Arrow is a cross-language development platform for in-memory data. It specifies a standardized language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware. It also provides computational libraries and zero-copy streaming messaging and interprocess communication. Languages currently supported include C, C++, Java, JavaScript, Python, and Ruby.

hbase * Java 0

Mirror of Apache HBase

hive * Java 0

Mirror of Apache Hive

faceswap * Python 0

Non official project based on original /r/Deepfakes thread. Many thanks to him!

caffe * C++ 0

Caffe: a fast open framework for deep learning.

incubator-mxnet * C++ 0

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more

Publications 0

Publications: research result

carbondata-site * HTML 0

Mirror of Apache CarbonData Site

thyroid 0

thyroid knowledge learning

open-hackathon * Python 0

Open Hackathon Platform

kcoin * Ruby 0

ambry * Java 0

Distributed object store

TeXmacs.scala * Scala 0

TODO * Groovy 0

todo list

bitcoin * C++ 0

Bitcoin Core integration/staging tree

play-scala-rest-api-example * Scala 0

Example Play Scala application showing REST API

play-java-rest-api-example * Java 0

REST API using Play in Java

opencv-java-face-recognition * Java 0

opencv * C++ 0

Open Source Computer Vision Library

carbondata * Scala 0

Mirror of Apache CarbonData

tpch-dbgen * C 0

TPC-H dbgen

dubbo * Java 0

Dubbo is a distributed, high performance RPC framework empowering applications with service import/export capabilities.

docker-spark * 0

Docker build for Apache Spark

KeepLearning 0


docs * Shell 0

Documentation for Docker Official Images in docker-library

dashboard * JavaScript 0

General-purpose web UI for Kubernetes clusters

hangzhou_house_knowledge * CSS 0

Home-buying knowledge summarized from the experience of buying a house in Hangzhou in 2017, shared in the hope that it will be useful to everyone. Buying a home is not easy, so cherish it.

OkcoinHuobiTickerSpider * Python 0

liveStrategyEngine * Python 0


rest * C 0

OKCoin REST API client examples; currently only C++, C#, Java, PHP, and Python are provided

spark-summit-2017-SanFrancisco * 0

Spark Summit 2017 San Francisco

streamingpro * JavaScript 0

Build Spark Batch/Streaming/MLlib Application by SQL

ethereumj * Java 0

Java implementation of the Ethereum yellowpaper

py4j * Java 0

Py4J enables Python programs to dynamically access arbitrary Java objects

tpcds-kit * C 0

TPC-DS benchmark kit with some modifications/fixes

spark-deep-learning * Python 0

Deep Learning Pipelines for Apache Spark

kubernetes * Go 0

Production-Grade Container Scheduling and Management

SparkRDMA * Scala 0

RDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark

tensorflow-1 * 0

Illustrated TensorFlow source code

ServiceComb-Java-Chassis * Java 0

ServiceComb Java Chassis is a Software Development Kit (SDK) for rapid development of microservices in Java, providing service registration, service discovery, dynamic routing, and service management features

spark * Scala 0

Mirror of Apache Spark

landscape * 0

Cloud Native Landscape

presto * Java 0

Distributed SQL query engine for big data

antlr4 * Java 0

ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.

lianti * C 0

Tools to process LIANTI sequence data

FinalPaper 0

canu * C++ 0

A single molecule sequence assembler for genomes large and small.

snappea * Scala 0

Parallel alignment using SNAP on ADAM. Apache 2 licensed.

adam * Scala 0

ADAM is a genomics analysis platform with specialized file formats built using Apache Avro, Apache Spark and Parquet. Apache 2 licensed.

keras * Python 0

Deep Learning library for Python. Convnets, recurrent neural networks, and more. Runs on Theano or TensorFlow.

jbwa * Java 0

Java Bindings (JNI) for bwa

gatk-1 * Java 0

GATK4 development

Spark-GATK * Scala 0

Spark-GATK is a genomics analysis framework based on Apache Spark and ADAM.

mame * C++ 0


cloud-scale-bwamem * Scala 0

TensorFlowOnSpark * Python 0

TensorFlowOnSpark brings TensorFlow programs onto Apache Spark clusters

wgsim * C 0

Reads simulator

SparkBWA * C 0

SparkBWA is a new tool that exploits the capabilities of a Big Data technology such as Apache Spark to boost the performance of one of the most widely adopted sequence aligners, the Burrows-Wheeler Aligner (BWA).

CloudSW Java 0


ray * C 0

An experimental distributed execution engine

ray-legacy * Python 0

An experimental distributed execution engine

native-utils * Java 0

A simple library class which helps with loading dynamic JNI libraries stored in the JAR archive

big-data-mapreduce-course * Shell 0

Big Data, MapReduce, Spark, PySpark, Java @ Santa Clara University, Fall 2016

lab * C 0

A customisable 3D platform for agent-based AI research

Full-Stack-Scala-Starter * Scala 0

Play 2.5, ScalaJS, Binding.scala starter project.

libssa * C 0

Library for SIMD accelerated optimal Sequence Alignments

opal * C++ 0

SIMD C/C++ library for massive optimal sequence alignment

swipe * C++ 0

Smith-Waterman database searches with inter-sequence SIMD parallelisation

SHD_code * C 0

forbars * Coq 0

Formalized Bioinformatics and Reliable Software

edlib * C++ 0

Lightweight, super fast C/C++ library for sequence alignment using edit (Levenshtein) distance.

parasail-java * C 0

Java JNI interface for the parasail C library.

parasail * C 0

Pairwise Sequence Alignment Library

HelloWorldJNI * Scala 0

basic Scala hello world with JNI call (linux/emacs/bash/gcc)

gtz * Python 0

A high-performance compressor with a high compression ratio, powered by GTXLab of Genetalks.

Complete-Striped-Smith-Waterman-Library * C 0


distributed Smith-Waterman Algorithm

jna * Java 0

Java Native Access

Gecloud 0

Gecloud: Gene Data Analysis System Based on Cloud Computing Technology

hadoop * Java 0

Mirror of Apache Hadoop

scala-style-guide * 0

Databricks Scala Coding Style Guide

* HTML 0

Spark Tutorial Collection

cloudflow * Java 0

Cloudflow is a MapReduce and Spark pipeline framework to simplify the pipeline creation in biomedical research, especially in the field of Genetics.

CSS 0

NCRELearning Java 0

FindJob Java 0

Java_DEMO * Java 0


QuickSort * Java 0

Java quicksort algorithm demo

mockito * Java 0

Mocking framework for unit tests written in Java

awesome-public-datasets * 0

An awesome list of high-quality open datasets in public domains (on-going).

bioscala * Scala 0

Bioinformatics for the Scala programming language

breeze * Scala 0

Breeze is a numerical processing library for Scala.

spark_meetup * 0

jni-maven * Java 0

Java JNI with Maven

HelloWorldJNIwithRegisterNatives * Scala 0

Scala Hello World with JNI and RegisterNatives

bwa-spark-0.2.3 * Scala 0

speedseq * C 0

A flexible framework for rapid genome analysis and interpretation

gatk-protected * Java 0

GATK Official Release Repository: contains the core MIT-licensed GATK framework, plus "protected" tools restricted to non-commercial use only

bowtie2 * C++ 0

A fast and sensitive gapped read aligner

RocAlphaGo * Python 0

An independent, student-led replication of DeepMind's 2016 Nature publication, "Mastering the game of Go with deep neural networks and tree search" (Nature 529, 484-489, 28 Jan 2016), details of which can be found on their website

Markdown * 0

Basic Markdown syntax.

CloudBurst * Java 0

Fork from CloudBurst project to learn about sequence alignment in Hadoop

qc-metrics * Scala 0

Read and variant metrics, useable for pipeline quality control purposes. Apache 2 licensed.

mango * Scala 0

A scalable genome browser. Apache 2 licensed.

MathJax * JavaScript 0

Beautiful math in all browsers

GAParquet * Scala 0

Integrating ADAM/Parquet with GATK

kylin * Java 0

Mirror of Apache Kylin

flink * Java 0

Mirror of Apache Flink

tensorflow * C++ 0

Computation using data flow graphs for scalable machine learning

utils * Scala 0

General (non-omics) code used across BDG products. Apache 2 licensed.

mango-notebook * JavaScript 0

Notebook tools for Big Data Genomics. Apache 2 licensed.

avocado * Scala 0

A Variant Caller, Distributed. Apache 2 licensed.

aas * Scala 0

Code to accompany Advanced Analytics with Spark from O'Reilly Media

JavaTraining Java 0

ScalaForImpatient Scala 0

bwa * C 0

Burrows-Wheeler Aligner for pairwise alignment between DNA sequences

alluxio * Java 0

Memory-Centric Virtual Distributed Storage System

WeChat * C# 0


wechat-deleted-friends * Python 0


gatk * Java 0

GATK Official Release Repository: contains the core MIT-licensed GATK framework, free for all uses

snap * C++ 0

Scalable Nucleotide Alignment Program -- a fast and accurate read aligner for high-throughput sequencing data

spark_eqtl * Python 0

eQTL analysis in Apache Spark

neo4j_php_examples 0


neo4j-neoclient-example * HTML 0

Simple PHP Silex App using NeoClient for Neo4j

Language                  Rank    Better than   Stars
GCC Machine Description   4       99.60%        68
Scala                     5       99.94%        643
HTML                      6280    89.85%        2
Java                      10772   87.08%        6
JavaScript                20079   77.32%        2
Updated 2021-09-27 18:41:42