Lijie Xu

Institute of Software, CAS

SparkInternals 4638

Notes on the design and implementation of Apache Spark

SparkLearning Scala 154

Learning to write Spark examples

MyNotes Python 102

Self-written notes that may be useful

ApacheSparkBook Scala 52

blogs 44

My blogs

SparkFaultBench Scala 12

A Spark Reliability Testing Suite

SparkProfiler Python 11

Profiling Spark Applications for Performance Comparison and Diagnosis

MySlides 9

Self-written slides that may be useful

SparkGC Scala 4

CSS 3

My Homepage

enhanced-Eclipse-MAT Java 2

extracting framework & user objects from task's heap dump

MyPaper 2

My papers and technical reports

streaming-benchmarks * Java 2

Benchmarks for Low Latency (Streaming) solutions including Apache Storm, Apache Spark, Apache Flink, ...

Spark-core Scala 2

The core library from Spark

SparkBench Scala 2

Spark benchmark

Misc 1

Stores miscellaneous things

SparkExamples Scala 1

Basic examples for learning Spark

HadoopJobInfoCollector Java 1

Fetch the configuration, timeline, counters, and log info from the JobTracker

GraphExamples Scala 1

Examples of GraphX

spark-sql-perf * Scala 1

BenchmarkScripts Java 1

Benchmark scripts in master

FMEM Java 1

A Fine-grained Memory Estimator for MapReduce Jobs

SparkStreamingBench Scala 1

Testing SparkStreaming

madlib * C++ 0

Mirror of Apache MADlib

bandwidth-bench * 0

Measure memory and disk bandwidth using the random access size as a parameter.

postgres * 0

Mirror of the official PostgreSQL GIT repository. Note that this is just a *mirror* - we don't work with pull requests on github. To contribute, please see

TR 0

My technical reports

moa * Java 0

MOA is an open source framework for Big Data stream mining. It includes a collection of machine learning algorithms (classification, regression, clustering, outlier detection, concept drift detection and recommender systems) and tools for evaluation.

AndroidGCProfiler Python 0

Profiling the GC activities in Android ART JVM

* 0

My blog

GellyLearning 0

yahoo-streaming-benchmark * Java 0

An extension of Yahoo's Benchmarks

Mprof Java 0

A Memory Profiler for Diagnosing Memory Problems in MapReduce Applications

spark * Scala 0

Mirror of Apache Spark

hadoop-1.2.0-enhanced Java 0

Enhanced hadoop-1.2.0 by LJX

MemoryEstimator Python 0

MEMR Java 0

OOMJobs Java 0

MapReduce jobs that will cause OOME

GarageBand XML 0

My music project

fdps-vii * Scala 0

Code & data for Fast data processing with Spark V2

public * C++ 0

SailingLab's Petuum project.

parameter_server * C++ 0

A distributed machine learning framework.

OOMCases CSS 0

Real-world OOM cases in MapReduce jobs

Hadoop-0.20.2-LJX Java 0

A part of Hadoop-0.20.2 source code (Some MapReduce Framework related code has been modified by Lijie Xu).

SampleBenchmark Java 0

Hadoop Benchmark - Input splits are first sampled

DMEM Java 0

Dataflow and Memory Estimator for MR Jobs

PerformanceAnalysis Java 0

Visualize the metrics collected from tasks' logs, pidstat, and the JVM

BigDataBenchmark Java 0

Representative MapReduce Job for Hadoop Benchmark

Language Rank Better than Stars
Scala 16 99.80% 230
Python 1872 97.94% 113
CSS 2621 94.14% 3
Java 9970 88.13% 7
Updated on 2021-09-24 03:24:11