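Preface
This article introduces a small tool written in Python that analyses the Git commit log to produce simple activity data for each member of a Git project.
Warning: the amount of code written is not a measure of a programmer's ability!
Startup arguments
There are five:
- Repository path
- Commit start date
- Commit end date
- Subdirectory of the Git repository
- Destination path of the CSV file that holds the analysis results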
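The five arguments listed above could also be declared with `argparse` instead of being read directly from `sys.argv`. This is only a minimal sketch: the argument names `repo`, `since`, `until`, `subdir` and `csvfile`, the program name `gitstats`, and the sample values are assumptions made for illustration and do not come from the original tool.

```python
import argparse


def build_parser():
    """Build a parser for the five positional arguments (names are assumed)."""
    parser = argparse.ArgumentParser(prog="gitstats")
    parser.add_argument("repo", help="path to the Git repository")
    parser.add_argument("since", help="commit start date")
    parser.add_argument("until", help="commit end date")
    parser.add_argument("subdir", help="subdirectory of the repository")
    parser.add_argument("csvfile", help="destination path of the CSV report")
    return parser


# Hypothetical values, used only to show the resulting namespace.
args = build_parser().parse_args(
    ["myrepo", "2017-01-01", "2017-03-31", "src", "stats.csv"]
)
print(args.repo, args.since, args.until, args.subdir, args.csvfile)
```

Compared with checking `len(sys.argv)` by hand, `argparse` prints a usage message and exits with a non-zero status when an argument is missing.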
exec_git
The git log command:
git -C {} log --since={} --until={} --pretty=tformat:%ae --shortstat --no-merges -- {} > {}
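Fill in the parameters, run the command with `os.system()`, and redirect its output to a local temporary file. The file is then read into memory as a simple list of strings.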
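As an alternative to `os.system()` and a temporary file, the same command could be run with `subprocess.run`, which captures the output directly and avoids shell quoting problems when a path contains spaces. This is a sketch of a different technique, not the article's implementation; the helper names `build_git_log_args` and `run_git_log` and the sample values are hypothetical.

```python
import subprocess


def build_git_log_args(repo, since, until, subdir):
    """Return the git log command as an argument list (no shell involved)."""
    return [
        "git", "-C", repo, "log",
        "--since={}".format(since),
        "--until={}".format(until),
        "--pretty=tformat:%ae",
        "--shortstat",
        "--no-merges",
        "--", subdir,
    ]


def run_git_log(repo, since, until, subdir):
    """Run git log and return its output as a list of lines.

    Raises subprocess.CalledProcessError if git exits with an error.
    """
    completed = subprocess.run(
        build_git_log_args(repo, since, until, subdir),
        capture_output=True, encoding="utf-8", check=True,
    )
    # keepends=True keeps the trailing "\n" that the regular expressions expect.
    return completed.stdout.splitlines(keepends=True)


# Hypothetical values; only the argument list is built here, git is not run.
print(build_git_log_args("myrepo", "2017-01-01", "2017-03-31", "src"))
```

`capture_output` requires Python 3.7 or later.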
parse
The `--shortstat` line in the git log output comes in three formats, each matched by its own regular expression.
REPATTERN_FULL = r"\s(\d+)\D+(\d+)\D+(\d+)\D+\n"
REPATTERN_INSERT_ONLY = r"\s(\d+)\D+(\d+)\sinsertion\D+\n"
REPATTERN_DELETE_ONLY = r"\s(\d+)\D+(\d+)\sdeletion\D+\n"
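To see how the three patterns divide the work, the sketch below applies them to three shortstat lines. The sample lines are written by hand to resemble typical `git log --shortstat` output and are assumptions, as is the helper name `classify`.

```python
import re

REPATTERN_FULL = r"\s(\d+)\D+(\d+)\D+(\d+)\D+\n"
REPATTERN_INSERT_ONLY = r"\s(\d+)\D+(\d+)\sinsertion\D+\n"
REPATTERN_DELETE_ONLY = r"\s(\d+)\D+(\d+)\sdeletion\D+\n"

# Hand-written sample lines (assumed, not taken from a real repository).
samples = [
    " 3 files changed, 10 insertions(+), 2 deletions(-)\n",
    " 1 file changed, 5 insertions(+)\n",
    " 2 files changed, 7 deletions(-)\n",
]


def classify(line):
    """Return (insertions, deletions) for a shortstat line, or None."""
    match = re.search(REPATTERN_FULL, line)
    if match:
        return int(match.group(2)), int(match.group(3))
    match = re.search(REPATTERN_INSERT_ONLY, line)
    if match:
        return int(match.group(2)), 0
    match = re.search(REPATTERN_DELETE_ONLY, line)
    if match:
        return 0, int(match.group(2))
    return None


for sample in samples:
    print(classify(sample))
```

The full pattern is tried first because it needs three numbers; a line with only insertions or only deletions contains two numbers and falls through to the narrower patterns.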
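Iterate over the lines and build a dictionary keyed by author, with the analysis result as the value.
Each result is a four-element record (a list in the sample code) containing:
- Number of commits
- Lines of code added
- Lines of code deleted
- Net lines of code changed (added minus deleted)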
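The accumulation into the per-author dictionary can be sketched in isolation as follows. The helper name `add_commit` and the e-mail addresses are hypothetical; the record layout follows the four fields listed above.

```python
def add_commit(stats, author, insert, delete):
    """Fold one commit into the per-author totals and return the dict."""
    loc = insert - delete  # net change contributed by this commit
    stat = stats.get(author)
    if stat is None:
        # First commit seen for this author.
        stats[author] = [1, insert, delete, loc]
    else:
        stat[0] += 1
        stat[1] += insert
        stat[2] += delete
        stat[3] += loc
    return stats


stats = {}
add_commit(stats, "alice@example.com", 10, 2)
add_commit(stats, "bob@example.com", 5, 0)
add_commit(stats, "alice@example.com", 0, 7)
print(stats["alice@example.com"])  # [2, 10, 9, 1]
print(stats["bob@example.com"])    # [1, 5, 0, 5]
```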
save_csv
This step is straightforward, so the details are omitted.
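For completeness, here is a minimal sketch of the CSV step that writes to an in-memory buffer so it can be tried without touching the file system. The helper name `write_stats` and the sample record are assumptions; the header matches the one used by the tool.

```python
import csv
import io

CSV_FILE_HEADER = ["Author", "Commit", "Insert", "Delete", "Loc"]


def write_stats(stats, stream):
    """Write the header and one row per author to a text stream."""
    writer = csv.writer(stream)
    writer.writerow(CSV_FILE_HEADER)
    for author, stat in stats.items():
        writer.writerow([author] + list(stat))


# Hypothetical record, used only to show the resulting CSV text.
buffer = io.StringIO()
write_stats({"alice@example.com": [2, 10, 9, 1]}, buffer)
print(buffer.getvalue())
```

When writing to a real file, the `csv` module documentation recommends opening it with `newline=''` so that the writer controls line endings itself.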
Sample code:
#!/usr/local/bin/python3
# -*- coding: utf-8 -*-
'''Analyse git branch commit log, for every version, every person.'''

import os
import sys
import re
import csv

# Placeholders in order: repo, since, until, subdirectory, output log file.
GIT_LOG = r'git -C {} log --since={} --until={} --pretty=tformat:%ae --shortstat --no-merges -- {} > {}'

REPATTERN_FULL = r"\s(\d+)\D+(\d+)\D+(\d+)\D+\n"
REPATTERN_INSERT_ONLY = r"\s(\d+)\D+(\d+)\sinsertion\D+\n"
REPATTERN_DELETE_ONLY = r"\s(\d+)\D+(\d+)\sdeletion\D+\n"

CSV_FILE_HEADER = ["Author", "Commit", "Insert", "Delete", "Loc"]


def exec_git(repo, since, until, subdir):
    '''Execute git log command, return string array.'''
    logfile = os.path.join(os.getcwd(), 'gitstats.txt')
    git_log_command = GIT_LOG.format(repo, since, until, subdir, logfile)
    os.system(git_log_command)
    lines = None
    with open(logfile, 'r', encoding='utf-8') as logfilehandler:
        lines = logfilehandler.readlines()
    return lines


def save_csv(stats, csvfile):
    '''Save stats data to csv file.'''
    # newline='' lets the csv module control line endings itself.
    with open(csvfile, 'w', encoding='utf-8', newline='') as csvfilehandler:
        writer = csv.writer(csvfilehandler)
        writer.writerow(CSV_FILE_HEADER)
        for author, stat in stats.items():
            writer.writerow([author, stat[0], stat[1], stat[2], stat[3]])


def parse(lines):
    '''Analyse git log lines, return a dict of per-author stats.'''
    prog_full = re.compile(REPATTERN_FULL)
    prog_insert_only = re.compile(REPATTERN_INSERT_ONLY)
    prog_delete_only = re.compile(REPATTERN_DELETE_ONLY)

    stats = {}
    # Each commit takes three lines: author email, empty line, shortstat.
    for i in range(0, len(lines) - 2, 3):
        author = lines[i].strip()  # drop the trailing newline
        info = lines[i + 2]
        insert, delete = 0, 0
        result = prog_full.search(info)
        if result:
            insert = int(result.group(2))
            delete = int(result.group(3))
        else:
            result = prog_insert_only.search(info)
            if result:
                insert = int(result.group(2))
            else:
                result = prog_delete_only.search(info)
                if result:
                    delete = int(result.group(2))
                else:
                    print('Regular expression fail!')
                    return None

        loc = insert - delete
        stat = stats.get(author)
        if stat is None:
            stats[author] = [1, insert, delete, loc]
        else:
            stat[0] += 1
            stat[1] += insert
            stat[2] += delete
            stat[3] += loc

    return stats


if __name__ == "__main__":
    print('gitstats begin')
    if len(sys.argv) != 6:
        print('Invalid argv parameters.')
        sys.exit(1)
    REPO = os.path.join(os.getcwd(), sys.argv[1])
    SINCE = sys.argv[2]
    UNTIL = sys.argv[3]
    SUB_DIR = sys.argv[4]
    CSV_FILE = os.path.join(os.getcwd(), sys.argv[5])
    LINES = exec_git(REPO, SINCE, UNTIL, SUB_DIR)
    assert LINES is not None
    STATS = parse(LINES)
    if STATS is None:
        # parse() already reported the unmatched line.
        sys.exit(1)
    save_csv(STATS, CSV_FILE)
    print('gitstats done')
Original article: http://www.jianshu.com/p/cafc3767fff5