2016-04-03

I am new to the world of big data and Hadoop. I am trying to run some code I found through Google. It consists of four steps, such as loading the data into the Hadoop file system, then adding an index to the data, and then the big step, which produces the reduced data using map and reduce. I am running the Hadoop program on a single-node cluster setup on Ubuntu 14.04 with Hadoop version 2.6.

I was able to run the first two steps. The code is driven by an XML configuration file:

The code I am using is from http://asterixdb.ics.uci.edu/fuzzyjoin/

When I run the last step, the fuzzy join itself, it gives me a series of errors:

The trace is attached below:

[email protected]:/home/midhu/fuzzyjoin$ cd fuzzyjoin-hadoop 
[email protected]:/home/midhu/fuzzyjoin/fuzzyjoin-hadoop$ hadoop jar target/fuzzyjoin-hadoop-0.0.2-SNAPSHOT.jar fuzzyjoin -conf src/main/resources/fuzzyjoin/dblp.quickstart.xml 
16/04/03 13:55:38 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 
Complete-Job started: Sun Apr 03 13:55:42 IST 2016 
Multi-Job started: Sun Apr 03 13:55:42 IST 2016 
FuzzyJoinDriver(TokensBasic.phase1) 
    Input Path: {hdfs://localhost:54310/user/hduser/dblp-small/records-000} 
    Output Path: hdfs://localhost:54310/user/hduser/dblp-small/tokens.phase1-000 
    Map Jobs: 2 
    Reduce Jobs: 1 
    Properties: {fuzzyjoin.similarity.name=Jaccard 
       fuzzyjoin.similarity.threshold=.5 
       fuzzyjoin.tokenizer=Word 
       fuzzyjoin.tokens.package=Scalar 
       fuzzyjoin.tokens.lengthstats=false 
       fuzzyjoin.ridpairs.group.class=TokenIdentity 
       fuzzyjoin.ridpairs.group.factor=1 
       fuzzyjoin.data.tokens= 
       fuzzyjoin.data.joinindex=} 
Job started: Sun Apr 03 13:55:42 IST 2016 
16/04/03 13:55:42 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id 
16/04/03 13:55:42 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId= 
16/04/03 13:55:42 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized 
16/04/03 13:55:43 INFO mapred.FileInputFormat: Total input paths to process : 1 
16/04/03 13:55:43 INFO mapreduce.JobSubmitter: number of splits:1 
16/04/03 13:55:44 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local1780986358_0001 
16/04/03 13:55:44 INFO mapreduce.Job: The url to track the job: http://localhost:8080/ 
16/04/03 13:55:44 INFO mapreduce.Job: Running job: job_local1780986358_0001 
16/04/03 13:55:44 INFO mapred.LocalJobRunner: OutputCommitter set in config null 
16/04/03 13:55:44 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapred.FileOutputCommitter 
16/04/03 13:55:45 INFO mapred.LocalJobRunner: Waiting for map tasks 
16/04/03 13:55:45 INFO mapred.LocalJobRunner: Starting task: attempt_local1780986358_0001_m_000000_0 
16/04/03 13:55:46 INFO mapreduce.Job: Job job_local1780986358_0001 running in uber mode : false 
16/04/03 13:55:46 INFO mapreduce.Job: map 0% reduce 0% 
16/04/03 13:55:46 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ] 
16/04/03 13:55:46 INFO mapred.MapTask: Processing split: hdfs://localhost:54310/user/hduser/dblp-small/records-000/part-00000:0+36687 
16/04/03 13:55:46 INFO mapred.MapTask: numReduceTasks: 1 
16/04/03 13:55:49 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584) 
16/04/03 13:55:49 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100 
16/04/03 13:55:49 INFO mapred.MapTask: soft limit at 83886080 
16/04/03 13:55:49 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600 
16/04/03 13:55:49 INFO mapred.MapTask: kvstart = 26214396; length = 6553600 
16/04/03 13:55:49 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer 
16/04/03 13:55:52 INFO mapred.LocalJobRunner: hdfs://localhost:54310/user/hduser/dblp-small/records-000/part-00000:0+36687 > map 
16/04/03 13:55:54 INFO mapred.LocalJobRunner: hdfs://localhost:54310/user/hduser/dblp-small/records-000/part-00000:0+36687 > map 
16/04/03 13:55:54 INFO mapred.MapTask: Starting flush of map output 
16/04/03 13:55:54 INFO mapred.MapTask: Spilling map output 
16/04/03 13:55:54 INFO mapred.MapTask: bufstart = 0; bufend = 15588; bufvoid = 104857600 
16/04/03 13:55:54 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26209408(104837632); length = 4989/6553600 
16/04/03 13:55:54 INFO mapred.MapTask: Finished spill 0 
16/04/03 13:55:54 INFO mapred.Task: Task:attempt_local1780986358_0001_m_000000_0 is done. And is in the process of committing 
16/04/03 13:55:54 INFO mapred.LocalJobRunner: hdfs://localhost:54310/user/hduser/dblp-small/records-000/part-00000:0+36687 
16/04/03 13:55:54 INFO mapred.Task: Task 'attempt_local1780986358_0001_m_000000_0' done. 
16/04/03 13:55:54 INFO mapred.LocalJobRunner: Finishing task: attempt_local1780986358_0001_m_000000_0 
16/04/03 13:55:54 INFO mapred.LocalJobRunner: map task executor complete. 
16/04/03 13:55:54 INFO mapred.LocalJobRunner: Waiting for reduce tasks 
16/04/03 13:55:54 INFO mapred.LocalJobRunner: Starting task: attempt_local1780986358_0001_r_000000_0 
16/04/03 13:55:54 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ] 
16/04/03 13:55:54 INFO mapreduce.Job: map 100% reduce 0% 
16/04/03 13:55:54 INFO mapred.ReduceTask: Using ShuffleConsumerPlugin: [email protected] 
16/04/03 13:55:54 INFO reduce.MergeManagerImpl: MergerManager: memoryLimit=363285696, maxSingleShuffleLimit=90821424, mergeThreshold=239768576, ioSortFactor=10, memToMemMergeOutputsThreshold=10 
16/04/03 13:55:54 INFO reduce.EventFetcher: attempt_local1780986358_0001_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events 
16/04/03 13:55:56 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1780986358_0001_m_000000_0 decomp: 9062 len: 9066 to MEMORY 
16/04/03 13:55:56 INFO reduce.InMemoryMapOutput: Read 9062 bytes from map-output for attempt_local1780986358_0001_m_000000_0 
16/04/03 13:55:57 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 9062, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->9062 
16/04/03 13:55:57 INFO reduce.EventFetcher: EventFetcher is interrupted.. Returning 
16/04/03 13:55:57 INFO mapred.LocalJobRunner: 1/1 copied. 
16/04/03 13:55:57 INFO reduce.MergeManagerImpl: finalMerge called with 1 in-memory map-outputs and 0 on-disk map-outputs 
16/04/03 13:55:57 INFO mapred.Merger: Merging 1 sorted segments 
16/04/03 13:55:57 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 9056 bytes 
16/04/03 13:55:57 INFO reduce.MergeManagerImpl: Merged 1 segments, 9062 bytes to disk to satisfy reduce memory limit 
16/04/03 13:55:57 INFO reduce.MergeManagerImpl: Merging 1 files, 9066 bytes from disk 
16/04/03 13:55:57 INFO reduce.MergeManagerImpl: Merging 0 segments, 0 bytes from memory into reduce 
16/04/03 13:55:57 INFO mapred.Merger: Merging 1 sorted segments 
16/04/03 13:55:57 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 9056 bytes 
16/04/03 13:55:57 INFO mapred.LocalJobRunner: 1/1 copied. 
16/04/03 13:56:00 INFO mapred.LocalJobRunner: reduce > reduce 
16/04/03 13:56:00 INFO mapreduce.Job: map 100% reduce 100% 
16/04/03 13:56:01 INFO mapred.Task: Task:attempt_local1780986358_0001_r_000000_0 is done. And is in the process of committing 
16/04/03 13:56:01 INFO mapred.LocalJobRunner: reduce > reduce 
16/04/03 13:56:01 INFO mapred.Task: Task attempt_local1780986358_0001_r_000000_0 is allowed to commit now 
16/04/03 13:56:02 INFO output.FileOutputCommitter: Saved output of task 'attempt_local1780986358_0001_r_000000_0' to hdfs://localhost:54310/user/hduser/dblp-small/tokens.phase1-000/_temporary/0/task_local1780986358_0001_r_000000 
16/04/03 13:56:02 INFO mapred.LocalJobRunner: reduce > reduce 
16/04/03 13:56:02 INFO mapred.Task: Task 'attempt_local1780986358_0001_r_000000_0' done. 
16/04/03 13:56:02 INFO mapred.LocalJobRunner: Finishing task: attempt_local1780986358_0001_r_000000_0 
16/04/03 13:56:02 INFO mapred.LocalJobRunner: reduce task executor complete. 
16/04/03 13:56:02 INFO mapreduce.Job: Job job_local1780986358_0001 completed successfully 
16/04/03 13:56:03 INFO mapreduce.Job: Counters: 38 
    File System Counters 
     FILE: Number of bytes read=1080562 
     FILE: Number of bytes written=1589660 
     FILE: Number of read operations=0 
     FILE: Number of large read operations=0 
     FILE: Number of write operations=0 
     HDFS: Number of bytes read=73374 
     HDFS: Number of bytes written=12847 
     HDFS: Number of read operations=15 
     HDFS: Number of large read operations=0 
     HDFS: Number of write operations=18 
    Map-Reduce Framework 
     Map input records=100 
     Map output records=1248 
     Map output bytes=15588 
     Map output materialized bytes=9066 
     Input split bytes=120 
     Combine input records=1248 
     Combine output records=597 
     Reduce input groups=597 
     Reduce shuffle bytes=9066 
     Reduce input records=597 
     Reduce output records=597 
     Spilled Records=1194 
     Shuffled Maps =1 
     Failed Shuffles=0 
     Merged Map outputs=1 
     GC time elapsed (ms)=176 
     CPU time spent (ms)=0 
     Physical memory (bytes) snapshot=0 
     Virtual memory (bytes) snapshot=0 
     Total committed heap usage (bytes)=241836032 
    Shuffle Errors 
     BAD_ID=0 
     CONNECTION=0 
     IO_ERROR=0 
     WRONG_LENGTH=0 
     WRONG_MAP=0 
     WRONG_REDUCE=0 
    File Input Format Counters 
     Bytes Read=36687 
    File Output Format Counters 
     Bytes Written=12847 
Job ended: Sun Apr 03 13:56:04 IST 2016 
The job took 21.44 seconds. 
FuzzyJoinDriver(TokensBasic.phase2) 
    Input Path: {hdfs://localhost:54310/user/hduser/dblp-small/tokens.phase1-000} 
    Output Path: hdfs://localhost:54310/user/hduser/dblp-small/tokens-000 
    Map Jobs: 2 
    Reduce Jobs: 1 
    Properties: {fuzzyjoin.similarity.name=Jaccard 
       fuzzyjoin.similarity.threshold=.5 
       fuzzyjoin.tokenizer=Word 
       fuzzyjoin.tokens.package=Scalar 
       fuzzyjoin.tokens.lengthstats=false 
       fuzzyjoin.ridpairs.group.class=TokenIdentity 
       fuzzyjoin.ridpairs.group.factor=1 
       fuzzyjoin.data.tokens= 
       fuzzyjoin.data.joinindex=} 
Job started: Sun Apr 03 13:56:04 IST 2016 
16/04/03 13:56:04 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized 
16/04/03 13:56:04 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized 
16/04/03 13:56:05 INFO mapred.FileInputFormat: Total input paths to process : 1 
16/04/03 13:56:05 INFO mapreduce.JobSubmitter: number of splits:1 
16/04/03 13:56:05 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local954589393_0002 
16/04/03 13:56:05 INFO mapreduce.Job: The url to track the job: http://localhost:8080/ 
16/04/03 13:56:05 INFO mapreduce.Job: Running job: job_local954589393_0002 
16/04/03 13:56:05 INFO mapred.LocalJobRunner: OutputCommitter set in config null 
16/04/03 13:56:05 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapred.FileOutputCommitter 
16/04/03 13:56:05 INFO mapred.LocalJobRunner: Waiting for map tasks 
16/04/03 13:56:05 INFO mapred.LocalJobRunner: Starting task: attempt_local954589393_0002_m_000000_0 
16/04/03 13:56:05 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ] 
16/04/03 13:56:05 INFO mapred.MapTask: Processing split: hdfs://localhost:54310/user/hduser/dblp-small/tokens.phase1-000/part-00000:0+12847 
16/04/03 13:56:05 INFO mapred.MapTask: numReduceTasks: 1 
16/04/03 13:56:06 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584) 
16/04/03 13:56:06 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100 
16/04/03 13:56:06 INFO mapred.MapTask: soft limit at 83886080 
16/04/03 13:56:06 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600 
16/04/03 13:56:06 INFO mapred.MapTask: kvstart = 26214396; length = 6553600 
16/04/03 13:56:06 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer 
16/04/03 13:56:06 INFO mapred.LocalJobRunner: 
16/04/03 13:56:06 INFO mapred.MapTask: Starting flush of map output 
16/04/03 13:56:06 INFO mapred.MapTask: Spilling map output 
16/04/03 13:56:06 INFO mapred.MapTask: bufstart = 0; bufend = 7866; bufvoid = 104857600 
16/04/03 13:56:06 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26212012(104848048); length = 2385/6553600 
16/04/03 13:56:06 INFO mapred.MapTask: Finished spill 0 
16/04/03 13:56:06 INFO mapred.Task: Task:attempt_local954589393_0002_m_000000_0 is done. And is in the process of committing 
16/04/03 13:56:06 INFO mapred.LocalJobRunner: hdfs://localhost:54310/user/hduser/dblp-small/tokens.phase1-000/part-00000:0+12847 
16/04/03 13:56:06 INFO mapred.Task: Task 'attempt_local954589393_0002_m_000000_0' done. 
16/04/03 13:56:06 INFO mapred.LocalJobRunner: Finishing task: attempt_local954589393_0002_m_000000_0 
16/04/03 13:56:06 INFO mapred.LocalJobRunner: map task executor complete. 
16/04/03 13:56:06 INFO mapred.LocalJobRunner: Waiting for reduce tasks 
16/04/03 13:56:06 INFO mapred.LocalJobRunner: Starting task: attempt_local954589393_0002_r_000000_0 
16/04/03 13:56:06 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ] 
16/04/03 13:56:06 INFO mapred.ReduceTask: Using ShuffleConsumerPlugin: [email protected] 
16/04/03 13:56:06 INFO reduce.MergeManagerImpl: MergerManager: memoryLimit=363285696, maxSingleShuffleLimit=90821424, mergeThreshold=239768576, ioSortFactor=10, memToMemMergeOutputsThreshold=10 
16/04/03 13:56:06 INFO reduce.EventFetcher: attempt_local954589393_0002_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events 
16/04/03 13:56:06 INFO reduce.LocalFetcher: localfetcher#2 about to shuffle output of map attempt_local954589393_0002_m_000000_0 decomp: 9062 len: 9066 to MEMORY 
16/04/03 13:56:06 INFO reduce.InMemoryMapOutput: Read 9062 bytes from map-output for attempt_local954589393_0002_m_000000_0 
16/04/03 13:56:06 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 9062, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->9062 
16/04/03 13:56:06 INFO reduce.EventFetcher: EventFetcher is interrupted.. Returning 
16/04/03 13:56:06 INFO mapred.LocalJobRunner: 1/1 copied. 
16/04/03 13:56:06 INFO reduce.MergeManagerImpl: finalMerge called with 1 in-memory map-outputs and 0 on-disk map-outputs 
16/04/03 13:56:06 INFO mapred.Merger: Merging 1 sorted segments 
16/04/03 13:56:06 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 9056 bytes 
16/04/03 13:56:06 INFO reduce.MergeManagerImpl: Merged 1 segments, 9062 bytes to disk to satisfy reduce memory limit 
16/04/03 13:56:06 INFO reduce.MergeManagerImpl: Merging 1 files, 9066 bytes from disk 
16/04/03 13:56:06 INFO reduce.MergeManagerImpl: Merging 0 segments, 0 bytes from memory into reduce 
16/04/03 13:56:06 INFO mapred.Merger: Merging 1 sorted segments 
16/04/03 13:56:06 INFO mapreduce.Job: Job job_local954589393_0002 running in uber mode : false 
16/04/03 13:56:06 INFO mapreduce.Job: map 100% reduce 0% 
16/04/03 13:56:06 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 9056 bytes 
16/04/03 13:56:06 INFO mapred.LocalJobRunner: 1/1 copied. 
16/04/03 13:56:06 INFO mapred.Task: Task:attempt_local954589393_0002_r_000000_0 is done. And is in the process of committing 
16/04/03 13:56:06 INFO mapred.LocalJobRunner: 1/1 copied. 
16/04/03 13:56:06 INFO mapred.Task: Task attempt_local954589393_0002_r_000000_0 is allowed to commit now 
16/04/03 13:56:06 INFO output.FileOutputCommitter: Saved output of task 'attempt_local954589393_0002_r_000000_0' to hdfs://localhost:54310/user/hduser/dblp-small/tokens-000/_temporary/0/task_local954589393_0002_r_000000 
16/04/03 13:56:06 INFO mapred.LocalJobRunner: reduce > reduce 
16/04/03 13:56:06 INFO mapred.Task: Task 'attempt_local954589393_0002_r_000000_0' done. 
16/04/03 13:56:06 INFO mapred.LocalJobRunner: Finishing task: attempt_local954589393_0002_r_000000_0 
16/04/03 13:56:06 INFO mapred.LocalJobRunner: reduce task executor complete. 
16/04/03 13:56:07 INFO mapreduce.Job: map 100% reduce 100% 
16/04/03 13:56:07 INFO mapreduce.Job: Job job_local954589393_0002 completed successfully 
16/04/03 13:56:07 INFO mapreduce.Job: Counters: 38 
    File System Counters 
     FILE: Number of bytes read=2179300 
     FILE: Number of bytes written=3182466 
     FILE: Number of read operations=0 
     FILE: Number of large read operations=0 
     FILE: Number of write operations=0 
     HDFS: Number of bytes read=99068 
     HDFS: Number of bytes written=31172 
     HDFS: Number of read operations=45 
     HDFS: Number of large read operations=0 
     HDFS: Number of write operations=30 
    Map-Reduce Framework 
     Map input records=597 
     Map output records=597 
     Map output bytes=7866 
     Map output materialized bytes=9066 
     Input split bytes=126 
     Combine input records=0 
     Combine output records=0 
     Reduce input groups=18 
     Reduce shuffle bytes=9066 
     Reduce input records=597 
     Reduce output records=597 
     Spilled Records=1194 
     Shuffled Maps =1 
     Failed Shuffles=0 
     Merged Map outputs=1 
     GC time elapsed (ms)=488 
     CPU time spent (ms)=0 
     Physical memory (bytes) snapshot=0 
     Virtual memory (bytes) snapshot=0 
     Total committed heap usage (bytes)=336207872 
    Shuffle Errors 
     BAD_ID=0 
     CONNECTION=0 
     IO_ERROR=0 
     WRONG_LENGTH=0 
     WRONG_MAP=0 
     WRONG_REDUCE=0 
    File Input Format Counters 
     Bytes Read=12847 
    File Output Format Counters 
     Bytes Written=5478 
Job ended: Sun Apr 03 13:56:07 IST 2016 
The job took 3.563 seconds. 
Multi-Job ended: Sun Apr 03 13:56:07 IST 2016 
The multi-job took 25.128 seconds. 
FuzzyJoinDriver(RIDPairsImproved) 
    Input Path: {hdfs://localhost:54310/user/hduser/dblp-small/records-000} 
    Output Path: hdfs://localhost:54310/user/hduser/dblp-small/ridpairs-000 
    Map Jobs: 2 
    Reduce Jobs: 1 
    Properties: {fuzzyjoin.similarity.name=Jaccard 
       fuzzyjoin.similarity.threshold=.5 
       fuzzyjoin.tokenizer=Word 
       fuzzyjoin.tokens.package=Scalar 
       fuzzyjoin.tokens.lengthstats=false 
       fuzzyjoin.ridpairs.group.class=TokenIdentity 
       fuzzyjoin.ridpairs.group.factor=1 
       fuzzyjoin.data.tokens=dblp-small/tokens-000/part-00000 
       fuzzyjoin.data.joinindex=} 
Job started: Sun Apr 03 13:56:08 IST 2016 
16/04/03 13:56:08 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized 
16/04/03 13:56:08 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized 
16/04/03 13:56:09 INFO mapred.FileInputFormat: Total input paths to process : 1 
16/04/03 13:56:09 INFO mapreduce.JobSubmitter: number of splits:1 
16/04/03 13:56:09 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local1951342027_0003 
16/04/03 13:56:16 INFO mapred.LocalDistributedCacheManager: Creating symlink: /tmp/mapred/local/1459671970648/part-00000 <- /home/midhu/fuzzyjoin/fuzzyjoin-hadoop/part-00000 
16/04/03 13:56:16 INFO mapred.LocalDistributedCacheManager: Localized hdfs://localhost:54310/user/hduser/dblp-small/tokens-000/part-00000 as file:/tmp/mapred/local/1459671970648/part-00000 
16/04/03 13:56:17 INFO mapreduce.Job: The url to track the job: http://localhost:8080/ 
16/04/03 13:56:17 INFO mapreduce.Job: Running job: job_local1951342027_0003 
16/04/03 13:56:17 INFO mapred.LocalJobRunner: OutputCommitter set in config null 
16/04/03 13:56:17 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapred.FileOutputCommitter 
16/04/03 13:56:17 INFO mapred.LocalJobRunner: Waiting for map tasks 
16/04/03 13:56:17 INFO mapred.LocalJobRunner: Starting task: attempt_local1951342027_0003_m_000000_0 
16/04/03 13:56:17 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ] 
16/04/03 13:56:17 INFO mapred.MapTask: Processing split: hdfs://localhost:54310/user/hduser/dblp-small/records-000/part-00000:0+36687 
16/04/03 13:56:17 INFO mapred.MapTask: numReduceTasks: 1 
16/04/03 13:56:17 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584) 
16/04/03 13:56:17 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100 
16/04/03 13:56:17 INFO mapred.MapTask: soft limit at 83886080 
16/04/03 13:56:17 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600 
16/04/03 13:56:17 INFO mapred.MapTask: kvstart = 26214396; length = 6553600 
16/04/03 13:56:17 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer 
16/04/03 13:56:17 INFO mapred.LocalJobRunner: map task executor complete. 
16/04/03 13:56:17 WARN mapred.LocalJobRunner: job_local1951342027_0003 
java.lang.Exception: java.lang.RuntimeException: Error in configuring object 
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462) 
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522) 
Caused by: java.lang.RuntimeException: Error in configuring object 
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) 
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) 
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) 
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:446) 
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) 
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243) 
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
    at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
    at java.lang.Thread.run(Thread.java:745) 
Caused by: java.lang.reflect.InvocationTargetException 
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
    at java.lang.reflect.Method.invoke(Method.java:606) 
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106) 
    ... 10 more 
Caused by: java.lang.RuntimeException: Error in configuring object 
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) 
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) 
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) 
    at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38) 
    ... 15 more 
Caused by: java.lang.reflect.InvocationTargetException 
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
    at java.lang.reflect.Method.invoke(Method.java:606) 
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106) 
    ... 18 more 
Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: file:/tmp/mapred/local/1459671970648/part-00000 (No such file or directory) 
    at edu.uci.ics.fuzzyjoin.tokenorder.TokenLoad.loadTokenRank(TokenLoad.java:60) 
    at edu.uci.ics.fuzzyjoin.tokenorder.TokenLoad.loadTokenRank(TokenLoad.java:40) 
    at edu.uci.ics.fuzzyjoin.hadoop.ridpairs.token.MapSelfJoin.configure(MapSelfJoin.java:98) 
    ... 23 more 
Caused by: java.io.FileNotFoundException: file:/tmp/mapred/local/1459671970648/part-00000 (No such file or directory) 
    at java.io.FileInputStream.open(Native Method) 
    at java.io.FileInputStream.<init>(FileInputStream.java:146) 
    at java.io.FileInputStream.<init>(FileInputStream.java:101) 
    at edu.uci.ics.fuzzyjoin.tokenorder.TokenLoad.loadTokenRank(TokenLoad.java:45) 
    ... 25 more 
16/04/03 13:56:18 INFO mapreduce.Job: Job job_local1951342027_0003 running in uber mode : false 
16/04/03 13:56:18 INFO mapreduce.Job: map 0% reduce 0% 
16/04/03 13:56:18 INFO mapreduce.Job: Job job_local1951342027_0003 failed with state FAILED due to: NA 
16/04/03 13:56:18 INFO mapreduce.Job: Counters: 0 
java.io.IOException: Job failed! 
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:836) 
    at edu.uci.ics.fuzzyjoin.hadoop.FuzzyJoinDriver.run(FuzzyJoinDriver.java:179) 
    at edu.uci.ics.fuzzyjoin.hadoop.ridpairs.RIDPairsImproved.main(RIDPairsImproved.java:108) 
    at edu.uci.ics.fuzzyjoin.hadoop.FuzzyJoin.bib(FuzzyJoin.java:39) 
    at edu.uci.ics.fuzzyjoin.hadoop.FuzzyJoin.main(FuzzyJoin.java:86) 
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
    at java.lang.reflect.Method.invoke(Method.java:606) 
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71) 
    at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144) 
    at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:152) 
    at edu.uci.ics.fuzzyjoin.hadoop.FuzzyJoinDriver.main(FuzzyJoinDriver.java:121) 
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
    at java.lang.reflect.Method.invoke(Method.java:606) 
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221) 
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136) 

I think this is a Hadoop configuration error on Ubuntu. I used the configuration from this tutorial.


Does anyone have an idea?

Answer


I finally managed to run the code and fix the error. The error occurred because the MapReduce program was being executed locally on the machine. I changed it to run on YARN, and the code now works fine for all kinds of data.
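For anyone who hits the same trace: the innermost `Caused by` shows `FileInputStream` being given a name that starts with `file:`. `java.io.FileInputStream` treats its string argument as a plain file system path, not as a URI, so a name in that form cannot be opened even when the file exists. That would fit with the job failing under the local runner and working on YARN, but this is only an inference from the trace and has not been checked against the fuzzyjoin or Hadoop sources. Here is a minimal Java sketch of the effect. The class and method names are hypothetical and are not part of fuzzyjoin or Hadoop.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.net.URI;

public class LocalCachePathDemo {

    /** Returns true if the given string can be opened as a plain local path. */
    public static boolean canOpen(String name) {
        try (FileInputStream in = new FileInputStream(name)) {
            return true;
        } catch (IOException e) {
            // FileNotFoundException is a subclass of IOException
            return false;
        }
    }

    /** Converts a "file:" URI string to a plain path; other strings pass through. */
    public static String toLocalPath(String name) {
        if (name.startsWith("file:")) {
            return new File(URI.create(name)).getPath();
        }
        return name;
    }

    public static void main(String[] args) throws IOException {
        // Create a real file so that only the form of the name differs.
        File tmp = File.createTempFile("part-", ".txt");
        tmp.deleteOnExit();

        String asUri = tmp.toURI().toString();   // URI form, starts with "file:"
        String asPath = toLocalPath(asUri);      // plain path form

        System.out.println(asUri + " opens: " + canOpen(asUri));
        System.out.println(asPath + " opens: " + canOpen(asPath));
    }
}
```

On Hadoop 2.x, running on YARN instead of the local runner is normally selected by setting `mapreduce.framework.name` to `yarn` in `mapred-site.xml`. The exact change made here is not recorded above.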
