Learning Apache Spark, including code and data. Most parts can run locally.
Spark MLlib Learning
Apache CarbonData Learning
SparkLearning_NoData, including code, pom, and so on
Adam Learning (bigdatagenomics)
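课程流 (Course Flow) is a course communication platform for students and teachers. Teachers can publish courses on it, including course information, teacher information, teaching assistant information, assignments, and announcements; students can join courses that interest them, or create a course homepage for a course they have chosen, post information in the course, discuss questions, and recommend study materials. 课程流 aims to give students and teachers a good platform for learning and communication, and to give users who are not computer science majors, or who find it tedious to edit code to build a course homepage, a quick and convenient way to create and maintain one. It addresses the current problems of scattered information and inconvenient communication, saves users time, and improves efficiency. Gathering course homepages in one place also lets users selectively study courses they have not taken, which supports self-directed learning and discussion.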
spark-2.X Learning
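Homepage is a personal homepage template that uses static web page technology and can display personal information online
浏阳张彩瓦 (Liuyang Zhang color tiles) website, including company introduction, image gallery, news reports, registration and booking, etc.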
Spark Library for Hadoop Upserts And Incrementals
Java Native Interface, including Java, Scala, C, and C++
Example code from Learning Spark book
The Smith-Waterman algorithm on Spark
Graphs for Everyone
neo4j, java, Cypher
Deep learning framework running on Spark
Notes talking about the design and implementation of Apache Spark
MapReduce, Spark, Java, and Scala for Data Algorithms Book
Analysis of Spark ML algorithm principles and of their concrete source-code implementation
BigDL: Distributed Deep Learning Library for Apache Spark
Spark Terasort
Apache CarbonData source code reading
Coolplay Spark (酷玩 Spark): Spark source code analysis, Spark libraries, etc.
GCDSS: Distributed Gene Clinical Decision Support System Based on Cloud Computing
DSA
Deep Residual Learning for Image Recognition
Based on the Spring Cloud `Finchley` version.
PaddleSeg is a high performance semantic segmentation toolkit based on PaddlePaddle. (PaddlePaddle 『飞桨』 image segmentation library)
winutils.exe hadoop.dll and hdfs.dll binaries for hadoop windows
Optimized data access for AI based on CarbonData files
ModelArts developer case-sharing and interaction platform. ModelArts service official website: https://www.huaweicloud.com/product/modelarts.html
An open-source storage layer that brings scalable, ACID transactions to Apache Spark™ and big data workloads.
AILearning
Repo for counting stars and contributing. Press F to pay respect to glorious developers.
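2018/2019/campus recruitment/spring recruitment/autumn recruitment/algorithms/Machine Learning/Deep Learning/Natural Language Processing (NLP)/C/C++/Python/interview notes
:metal: LabelImg is a graphical image annotation tool for labeling object bounding boxes in images
Shenzhen home-buying guide
Companion code for the book "21 Projects to Master Deep Learning: A Detailed Practical Guide Based on TensorFlow" (《21个项目玩转深度学习》)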
Android USB tethering driver for Mac OS X
ESC-50: Dataset for Environmental Sound Classification
Deep Learning Book Chinese Translation
Tesseract Open Source OCR Engine (main repository)
Python implementation of the Parquet columnar file format.
Apache Parquet
Apache Arrow is a cross-language development platform for in-memory data. It specifies a standardized language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware. It also provides computational libraries and zero-copy streaming messaging and interprocess communication. Languages currently supported include C, C++, Java, JavaScript, Python, and Ruby.
Mirror of Apache HBase
Mirror of Apache Hive
Unofficial project based on the original /r/Deepfakes thread. Many thanks to him!
Caffe: a fast open framework for deep learning.
Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
Publications: research results
Mirror of Apache CarbonData Site
Huobi (火币) API documentation
thyroid knowledge learning
Open Hackathon Platform
Distributed object store
todo list
Bitcoin Core integration/staging tree
Example Play Scala application showing REST API
REST API using Play in Java
A delightful community-driven (with 1,100+ contributors) framework for managing your zsh configuration. Includes 200+ optional plugins (rails, git, OSX, hub, capistrano, brew, ant, php, python, etc), over 140 themes to spice up your morning, and an auto-update tool that makes it easy to keep up with the latest updates from the community.
Open Source Computer Vision Library
Mirror of Apache CarbonData
TPC-H dbgen
Dubbo is a distributed, high performance RPC framework empowering applications with service import/export capabilities.
Docker build for Apache Spark
KeepLearning
Documentation for Docker Official Images in docker-library
General-purpose web UI for Kubernetes clusters
Home-buying knowledge drawn from my experience buying a home in Hangzhou in 2017, shared in the hope that it helps everyone. Buying a home is not easy, so cherish it once you have.
WeQuant (微宽网): Bitcoin quantitative trading, high-quality strategy source code, accurate backtesting, and free live trading, all on WeQuant
OKCoin REST API client examples; currently only C++, C#, Java, PHP, and Python are provided
Spark Summit 2017 San Francisco
Build Spark Batch/Streaming/MLlib Application by SQL
Java implementation of the Ethereum yellowpaper
Py4J enables Python programs to dynamically access arbitrary Java objects
TPC-DS benchmark kit with some modifications/fixes
Deep Learning Pipelines for Apache Spark
A series of training exercises using Google's open-source TensorFlow
Production-Grade Container Scheduling and Management
RDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark
Microsoft Machine Learning for Apache Spark
Illustrated TensorFlow source code
ServiceComb Java Chassis is a Software Development Kit (SDK) for rapid development of microservices in Java, providing service registration, service discovery, dynamic routing, and service management features
Mastering Apache Spark 2
Mirror of Apache Spark
Cloud Native Landscape
Distributed SQL query engine for big data
ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.
CloudBWA: a distributed read mapping algorithm in GCDSS
Tools to process LIANTI sequence data
A single molecule sequence assembler for genomes large and small.
Parallel alignment using SNAP on ADAM. Apache 2 licensed.
ADAM is a genomics analysis platform with specialized file formats built using Apache Avro, Apache Spark and Parquet. Apache 2 licensed.
Deep Learning library for Python. Convnets, recurrent neural networks, and more. Runs on Theano or TensorFlow.
Java Bindings (JNI) for bwa
GATK4 development
Spark-GATK is a genomics analysis framework based on Apache Spark and ADAM.
MAME
TensorFlowOnSpark brings TensorFlow programs onto Apache Spark clusters
Reads simulator
SparkBWA is a new tool that exploits the capabilities of a Big Data technology such as Apache Spark to boost the performance of one of the most widely adopted sequence aligners, the Burrows-Wheeler Aligner (BWA).
Ansible is a radically simple IT automation platform that makes your applications and systems easier to deploy. Avoid writing scripts or custom code to deploy and update your applications— automate in a language that approaches plain English, using SSH, with no agents to install on remote systems.
CloudSW
An experimental distributed execution engine
An experimental distributed execution engine
A simple library class which helps with loading dynamic JNI libraries stored in the JAR archive
Big Data, MapReduce, Spark, PySpark, Java @ Santa Clara University, Fall 2016
A customisable 3D platform for agent-based AI research
Play 2.5, ScalaJS, Binding.scala starter project.
Library for SIMD accelerated optimal Sequence Alignments
SIMD C/C++ library for massive optimal sequence alignment
Smith-Waterman database searches with inter-sequence SIMD parallelisation
Formalized Bioinformatics and Reliable Software
Lightweight, super fast C/C++ library for sequence alignment using edit (Levenshtein) distance.
Java JNI interface for the parasail C library.
Pairwise Sequence Alignment Library
basic Scala hello world with JNI call (linux/emacs/bash/gcc)
A high performance and compression ratio compressor, powered by GTXLab of Genetalks.
distributed Smith-Waterman Algorithm
Java Native Access
Gecloud: Gene Data Analysis System Based on Cloud Computing Technology
Mirror of Apache Hadoop
Databricks Scala Coding Style Guide
Spark Tutorial Collection
Cloudflow is a MapReduce and Spark pipeline framework to simplify the pipeline creation in biomedical research, especially in the field of Genetics.
Producer/consumer model
Java quicksort algorithm demo
Mocking framework for unit tests written in Java
An awesome list of high-quality open datasets in public domains (on-going).
Bioinformatics for the Scala programming language
Breeze is a numerical processing library for Scala.
Java JNI with Maven
Scala Hello World with JNI and RegisterNatives
A flexible framework for rapid genome analysis and interpretation
GATK Official Release Repository: contains the core MIT-licensed GATK framework, plus "protected" tools restricted to non-commercial use only
Redis is an in-memory database that persists on disk. The data model is key-value, but many different kinds of values are supported: Strings, Lists, Sets, Sorted Sets, Hashes, HyperLogLogs, Bitmaps.
A fast and sensitive gapped read aligner
An independent, student-led replication of DeepMind's 2016 Nature publication, "Mastering the game of Go with deep neural networks and tree search" (Nature 529, 484-489, 28 Jan 2016), details of which can be found on their website https://deepmind.com/publications.html.
Basic Markdown syntax.
Fork from CloudBurst project to learn about sequence alignment in Hadoop
Read and variant metrics, useable for pipeline quality control purposes. Apache 2 licensed.
A scalable genome browser. Apache 2 licensed.
Beautiful math in all browsers
Integrating ADAM/Parquet with GATK
Mirror of Apache Kylin
Mirror of Apache Flink
Computation using data flow graphs for scalable machine learning
General (non-omics) code used across BDG products. Apache 2 licensed.
Notebook tools for Big Data Genomics. Apache 2 licensed.
A Variant Caller, Distributed. Apache 2 licensed.
Code to accompany Advanced Analytics with Spark from O'Reilly Media
Burrows-Wheeler Aligner for pairwise alignment between DNA sequences
Memory-Centric Virtual Distributed Storage System
WeChat desktop client
Check which WeChat friends have deleted you
GATK Official Release Repository: contains the core MIT-licensed GATK framework, free for all uses
Scalable Nucleotide Alignment Program -- a fast and accurate read aligner for high-throughput sequencing data
eQTL analysis in Apache Spark
neo4j, php, examples, neoclient
Simple PHP Silex App using NeoClient for Neo4j
Language | Rank | Better than | Stars |
---|---|---|---|
GCC Machine Description | 3 | 99.71% | 58 |
Scala | 5 | 99.94% | 533 |
HTML | 5739 | 90.39% | 2 |
Java | 10141 | 87.64% | 6 |
JavaScript | 19010 | 78.10% | 2 |