System configuration:
- Spark properties: control most application parameters; can be set via a SparkConf object or through Java system properties
- Environment variables: can be set per-node via the conf/spark-env.sh script, e.g. IP address and port settings
- Logging: can be configured through log4j.properties
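As an illustration of the environment-variable item above, a per-node conf/spark-env.sh is a plain shell script of exports; the values shown here are placeholders, not recommendations:

```shell
# conf/spark-env.sh -- sourced on each node when Spark daemons start
export SPARK_LOCAL_IP=192.168.1.10   # IP address for Spark to bind to on this node (placeholder)
export SPARK_MASTER_PORT=7077        # port for the standalone master to listen on
```

Because the file is just shell, it can also compute values (e.g. derive the IP from `hostname -i`) rather than hard-coding them.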
1. Spark Properties
These properties can be set directly on a SparkConf passed to your SparkContext. SparkConf allows you to configure some of the common properties (e.g. master URL and application name), as well as arbitrary key-value pairs through the set() method. For example, we could initialize an application with two threads as follows:
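The snippet the sentence refers to is missing here; a minimal Scala sketch in the spirit of the Spark documentation's example (the application name "CountingSheep" is illustrative):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Run locally with two worker threads; arbitrary key-value pairs
// could also be supplied via conf.set("key", "value").
val conf = new SparkConf()
  .setMaster("local[2]")
  .setAppName("CountingSheep")
val sc = new SparkContext(conf)
```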
Note that we run with local[2], meaning two threads, which represents "minimal" parallelism and can help detect bugs that only exist when we run in a distributed context.
bin/spark-submit will also read configuration options from conf/spark-defaults.conf, in which each line consists of a key and a value separated by whitespace. For example:
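For instance, a conf/spark-defaults.conf might look like the following; the property names are real Spark configuration keys, but the master URL and memory size are placeholder values:

```
spark.master            spark://5.6.7.8:7077
spark.executor.memory   4g
spark.eventLog.enabled  true
spark.serializer        org.apache.spark.serializer.KryoSerializer
```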
Priority: properties set directly on the SparkConf take highest precedence, then flags passed to spark-submit or spark-shell, then options in the spark-defaults.conf file.