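When we did exercises in primary school, we could spot repeated values in a problem by comparing them by hand. The data we process on a computer today is far larger, so we need a tool to pick out repeated data or file names for us. What does Python offer here? In this post I will show how to use the re module to extract data from text, using the following example: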
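The text, shown below, is a map file produced by a build. I want to extract each symbol, its address, and the modules that use it.
When the "used in" entry runs over several lines, that is, when the symbol is used in several .o files, how do I extract every .o?
The expression I have so far is: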
_([a-zA-Z0-9_]+)\s+([a-z0-9A-Z]{8})\s+defined\s+in\s+[a-zA-Z0-9_]+.o\s+section\s+.+\n\s+used in\s+([a-zA-Z0-9_]+.o)\s*\n\s*(\w+.o)\n\s*(\w+.o)
Question 1:
How do I match a variable number of "******.o" entries?
Question 2:
How do I extract every record that satisfies the pattern?
_PfTORQ_r_ThermEffCorrMult 000fe417 defined in torqmall.o section .bss
    used in torqmctl.o
            torqmrat.o
_PeTORQ_GearState 000fe419 defined in torqmall.o section .bss
    used in torq_meth_jac.o
            torq_mulf_jac.o
            torqmgve.o
            torqmgvv.o
            etcdmtps.o
_PeTORQ_GearStatePrev 000fe41a defined in torqmall.o section .bss
_PeTORQ_GearStateDsrd 000fe41b defined in torqmall.o section .bss
_VfTORQ_AXIS_RPM_W_11Brk 000fe41c defined in torqmall.o section .bss
    used in torqmdes.o
            tqdrmall.o
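Question 1 is harder than it looks because Python's re module keeps only the last repetition of a repeated capture group. The short sketch below illustrates this on a single "used in" entry taken from the data above; the variable names are only illustrative.

```python
import re

line = "used in torq_meth_jac.o torq_mulf_jac.o torqmgve.o"

# The repeated group matches all three names, but only the last
# repetition is kept in group 1.
m = re.search(r"used in(?:\s+(\w+\.o))+", line)
last_only = m.group(1)

# Matching the name pattern on its own with findall keeps every occurrence.
all_names = re.findall(r"\w+\.o", line)

print(last_only)
print(all_names)
```

This is why adding more `(\w+.o)` groups to the expression by hand does not scale: the number of groups has to be fixed in advance, while the number of .o files varies per symbol.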
Solution:
re.findall(pattern, string, flags=0)
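What `re.findall` returns depends on whether the pattern contains capture groups: with no groups it returns the full matched strings, and with two or more groups it returns a list of tuples. A minimal sketch, using a made-up two-line sample (`_SymA`, `mod_a.o` and so on are hypothetical names in the same shape as the map file):

```python
import re

# Hypothetical sample, shaped like two records of the map file.
sample = (
    "_SymA 000fe417 defined in mod_a.o section .bss\n"
    "_SymB 000fe419 defined in mod_b.o section .bss\n"
)

# No capture groups: each result is the whole matched text.
whole = re.findall(r"\w+\.o", sample)

# Two capture groups: each result is a (symbol, address) tuple.
pairs = re.findall(r"_(\w+)\s+([0-9a-fA-F]{8})\s+defined", sample)

print(whole)
print(pairs)
```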
Example:
>>> import re
>>> text = "He was carefully disguised but captured quickly by police."
>>> re.findall(r"\w+ly", text)
['carefully', 'quickly']
Test:
In [1]: import re
In [2]: yourstr = """_PfTORQ_r_ThermEffCorrMult 000fe417 defined in torqmall.o section .bss used in torqmctl.o torqmrat.o _PeTORQ_GearState 000fe419 defined in torqmall.o section .bss used in torq_meth_jac.o torq_mulf_jac.o torqmgve.o torqmgvv.o etcdmtps.o _PeTORQ_GearStatePrev 000fe41a defined in torqmall.o section .bss _PeTORQ_GearStateDsrd 000fe41b defined in torqmall.o section .bss _VfTORQ_AXIS_RPM_W_11Brk 000fe41c defined in torqmall.o section .bss used in torqmdes.o tqdrmall.o"""
In [3]: re.findall(r'\w+\.o', yourstr)
Out[3]: ['torqmall.o', 'torqmctl.o', 'torqmrat.o', 'torqmall.o', 'torq_meth_jac.o', 'torq_mulf_jac.o', 'torqmgve.o', 'torqmgvv.o', 'etcdmtps.o', 'torqmall.o', 'torqmall.o', 'torqmall.o', 'torqmdes.o', 'tqdrmall.o']
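As the output shows, every .o name we were after has been extracted, so re.findall works well for this kind of problem. Note that the result is one flat list: it also contains the defining module (torqmall.o) of each symbol, and it does not record which symbol each .o belongs to.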
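If the goal is to keep each symbol together with its address and its own list of modules, one possible approach is to work in two steps: match one record at a time with `re.finditer`, then run `re.findall` on that record's "used in" part only. The sketch below is an illustration under assumptions, not the original poster's code: `RECORD` and `parse_map` are hypothetical names, the line layout of `map_text` is assumed from the expression in the question, and only three of the records are included to keep it short.

```python
import re

# Three records from the map file; the line breaks are an assumed layout.
map_text = """_PfTORQ_r_ThermEffCorrMult 000fe417 defined in torqmall.o section .bss
    used in torqmctl.o
            torqmrat.o
_PeTORQ_GearStatePrev 000fe41a defined in torqmall.o section .bss
_VfTORQ_AXIS_RPM_W_11Brk 000fe41c defined in torqmall.o section .bss
    used in torqmdes.o
            tqdrmall.o
"""

# One record: symbol, 8-hex-digit address, defining module, section name,
# then everything up to the next record header (or end of text) as the tail.
RECORD = re.compile(
    r"_(\w+)\s+([0-9a-fA-F]{8})\s+defined\s+in\s+(\w+\.o)\s+section\s+\S+"
    r"(.*?)(?=\s_\w+\s+[0-9a-fA-F]{8}\s+defined\s+in|\Z)",
    re.DOTALL,
)


def parse_map(text):
    """Return a list of (symbol, address, defined_in, used_in) tuples."""
    records = []
    for m in RECORD.finditer(text):
        symbol, address, defined_in, tail = m.groups()
        # The tail holds only the "used in" entries of this one symbol.
        used_in = re.findall(r"\w+\.o", tail)
        records.append((symbol, address, defined_in, used_in))
    return records


records = parse_map(map_text)
for symbol, address, defined_in, used_in in records:
    print(symbol, address, defined_in, used_in)
```

Because the pattern relies on `\s` rather than on literal newlines, the same sketch should also work on the single-line form of the text used in the test above. A symbol with no "used in" entry simply gets an empty list.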