Custom Dataset Series: From labelImg Format to txt Format (YOLO Format, ICDAR Format)

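Contents

Preface
Converting XML to YOLO format
Converting XML to ICDAR format
🔰 Summary 🔰
🔷 1. From labelImg format -> txt format (YOLO format, ICDAR format)
2. From binary mask -> labelme format -> COCO format
3. From labelme format -> VOC format, plus from binary mask -> VOC format
4. From RGB -> binary mask -> COCO format
5. Instance segmentation mask -> semantic segmentation mask -> mask augmentation
6. COCO format -> YOLO format
Aligning the names of dual-modality images with their annotation files
Correcting nodes, attributes, and text in XML annotation files
Statistical analysis of COCO JSON datasets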

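Preface

XML annotations are fairly common in object detection, but they are by no means the only option. This post converts the XML produced by labelImg into two txt-based formats: YOLO and ICDAR.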

Converting XML to YOLO format

XML format

<?xml version="1.0" ?>
<annotation>
    <folder>JPEGImages</folder>
    <filename>000000.jpg</filename>
    <path>E:\dataset\camo\JPEGImages\000000.jpg</path>
    <source>
        <database>Unknown</database>
    </source>
    <size>
        <width>500</width>
        <height>282</height>
        <depth>3</depth>
    </size>
    <segmented>0</segmented>
    <object>
        <name>person</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>170</xmin>
            <ymin>112</ymin>
            <xmax>223</xmax>
            <ymax>232</ymax>
        </bndbox>
    </object>
</annotation>

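txt format (one line per object: class_id x_center y_center width height, all normalized by the image size)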

0 0.391 0.6063829787234042 0.106 0.425531914893617
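The label line above can be reproduced by hand from the XML sample. The following is a minimal sketch, not the article's script: the function name `voc_box_to_yolo` is hypothetical, and the `- 1` offset is an assumption carried over from the conversion script in this article so that the numbers agree with the sample.

```python
def voc_box_to_yolo(img_w, img_h, xmin, ymin, xmax, ymax):
    """Return (x_center, y_center, width, height), each divided by the image size."""
    # Assumption: the -1 offset mirrors the conversion script in this article.
    x_center = ((xmin + xmax) / 2.0 - 1) / img_w
    y_center = ((ymin + ymax) / 2.0 - 1) / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return x_center, y_center, width, height


# Values taken from the XML sample: a 500x282 image with one "person" box.
box = voc_box_to_yolo(500, 282, 170, 112, 223, 232)

# Class id 0 is the index of "person" in a one-element class list.
print("0 " + " ".join(str(v) for v in box))
```

For the sample box this gives a center of (195.5, 171) pixels and a size of 53 x 120 pixels before normalization, which matches the label line above up to floating-point rounding.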

# -*- coding: utf-8 -*-
import os
import xml.etree.ElementTree as ET

sets = ['train', 'val', 'test']
# classes = ["a", "b"]  # replace with your own classes
classes = ['person']  # class names
abs_path = os.getcwd()


def convert(size, box):
    """Convert (xmin, xmax, ymin, ymax) in pixels to normalized (x, y, w, h)."""
    dw = 1. / (size[0])
    dh = 1. / (size[1])
    # The -1 offset is kept as in the original script.
    x = (box[0] + box[1]) / 2.0 - 1
    y = (box[2] + box[3]) / 2.0 - 1
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x * dw
    w = w * dw
    y = y * dh
    h = h * dh
    return x, y, w, h


def convert_annotation(image_id):
    in_path = abs_path + '/Annotations/%s.xml' % image_id
    out_path = abs_path + '/labels/%s.txt' % image_id
    with open(in_path, encoding='UTF-8') as in_file:
        tree = ET.parse(in_file)
    root = tree.getroot()
    size = root.find('size')
    w = int(size.find('width').text)
    h = int(size.find('height').text)
    with open(out_path, 'w') as out_file:
        for obj in root.iter('object'):
            difficult = obj.find('difficult').text
            # difficult = obj.find('Difficult').text
            cls = obj.find('name').text
            # Skip unknown classes and objects marked as difficult
            if cls not in classes or int(difficult) == 1:
                continue
            cls_id = classes.index(cls)
            xmlbox = obj.find('bndbox')
            b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text),
                 float(xmlbox.find('ymin').text), float(xmlbox.find('ymax').text))
            b1, b2, b3, b4 = b
            # Clamp boxes that extend past the image border
            if b2 > w:
                b2 = w
            if b4 > h:
                b4 = h
            b = (b1, b2, b3, b4)
            bb = convert((w, h), b)
            out_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')


if __name__ == '__main__':
    os.makedirs(abs_path + '/labels/', exist_ok=True)
    for image_set in sets:
        # Each ImageSets/Main/<set>.txt lists the image ids of that split
        with open(abs_path + '/ImageSets/Main/%s.txt' % image_set) as f:
            image_ids = f.read().strip().split()
        with open(abs_path + '/%s.txt' % image_set, 'w') as list_file:
            for image_id in image_ids:
                list_file.write(abs_path + '/JPEGImages/%s.jpg\n' % image_id)
                convert_annotation(image_id)

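Converting XML to ICDAR format

The ICDAR format is mainly used for text detection and recognition, and its boxes are usually four-point quadrilaterals.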
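Since labelImg only draws axis-aligned rectangles, each rectangle has to be expanded into four corner points. The sketch below shows one way to do that; the helper names `rect_to_quad` and `quad_to_icdar_line` are hypothetical, and the corner order (clockwise from the top-left) is an assumption chosen to agree with the sample output shown in this article.

```python
def rect_to_quad(xmin, ymin, xmax, ymax):
    """Expand an axis-aligned box into four corners, clockwise from top-left."""
    return [
        (xmin, ymin),  # top-left
        (xmax, ymin),  # top-right
        (xmax, ymax),  # bottom-right
        (xmin, ymax),  # bottom-left
    ]


def quad_to_icdar_line(quad, text):
    """Format four (x, y) points plus a transcription as one comma-separated line."""
    coords = ",".join(str(float(v)) for point in quad for v in point)
    return coords + "," + text


line = quad_to_icdar_line(rect_to_quad(197, 488, 312, 521), "44106")
print(line)  # 197.0,488.0,312.0,488.0,312.0,521.0,197.0,521.0,44106
```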

Original XML: the labelImg annotation

<annotation>
    <folder>数据集jpg</folder>
    <filename>81.jpg</filename>
    <path>C:\Users\cam_robot\Desktop\28船\船名字符数据\数据集jpg\81.jpg</path>
    <source>
        <database>Unknown</database>
    </source>
    <size>
        <width>1267</width>
        <height>765</height>
        <depth>3</depth>
    </size>
    <segmented>0</segmented>
    <object>
        <name>44106</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>197</xmin>
            <ymin>488</ymin>
            <xmax>312</xmax>
            <ymax>521</ymax>
        </bndbox>
    </object>
    <object>
        <name>CHINA</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>606</xmin>
            <ymin>514</ymin>
            <xmax>671</xmax>
            <ymax>541</ymax>
        </bndbox>
    </object>
    <object>
        <name>COAST</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>679</xmin>
            <ymin>515</ymin>
            <xmax>747</xmax>
            <ymax>543</ymax>
        </bndbox>
    </object>
    <object>
        <name>GUARD</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>757</xmin>
            <ymin>518</ymin>
            <xmax>824</xmax>
            <ymax>546</ymax>
        </bndbox>
    </object>
</annotation>

Target result: ICDAR-style txt (one line per object: x1,y1,x2,y2,x3,y3,x4,y4,text)

197.0,488.0,312.0,488.0,312.0,521.0,197.0,521.0,44106
606.0,514.0,671.0,514.0,671.0,541.0,606.0,541.0,CHINA
679.0,515.0,747.0,515.0,747.0,543.0,679.0,543.0,COAST
757.0,518.0,824.0,518.0,824.0,546.0,757.0,546.0,GUARD
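To check a converted file, it can help to read a line back into points and text. This is a small sketch under the assumption that every line holds exactly eight coordinates followed by the transcription; the function name `parse_icdar_line` is hypothetical and not part of any ICDAR toolkit.

```python
def parse_icdar_line(line):
    """Split one ICDAR-style line into four (x, y) points and a transcription."""
    # maxsplit=8 keeps any commas inside the transcription intact.
    parts = line.rstrip("\n").split(",", 8)
    if len(parts) != 9:
        raise ValueError("expected 8 coordinates and a label: %r" % line)
    coords = [float(v) for v in parts[:8]]
    # Pair up x values (even positions) with y values (odd positions).
    points = list(zip(coords[0::2], coords[1::2]))
    return points, parts[8]


points, text = parse_icdar_line("606.0,514.0,671.0,514.0,671.0,541.0,606.0,541.0,CHINA")
print(points)  # [(606.0, 514.0), (671.0, 514.0), (671.0, 541.0), (606.0, 541.0)]
print(text)    # CHINA
```

Splitting at most eight times means a transcription that itself contains commas is still returned as a single string.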

Conversion script: xml2txt.py

import os
import xml.etree.ElementTree as ET


def dealXml(xmlPath):
    """Return one [x1, y1, x2, y2, x3, y3, x4, y4, label] list per object."""
    tree = ET.parse(xmlPath)
    root = tree.getroot()  # root node, here <annotation>
    filename = root.find('filename').text  # image file name, e.g. time1.jpg (not used further)
    lists = []
    for obj in root.findall('object'):  # all direct children named 'object'
        lineList = []
        for attr in list(obj):  # every direct child of this object
            if 'bndbox' in attr.tag:  # the bounding-box node
                xmin = float(attr.find('xmin').text)
                ymin = float(attr.find('ymin').text)
                xmax = float(attr.find('xmax').text)
                ymax = float(attr.find('ymax').text)
                # Four corners, clockwise from top-left; the label stays last
                lineList = [xmin, ymin, xmax, ymin, xmax, ymax, xmin, ymax] + lineList
            if attr.tag == "name":
                label = attr.text
                lineList.append(label)
        lists.append(lineList)
    return lists


def saveTxt(xmlPath):
    lists = dealXml(xmlPath)
    if len(lists) > 0:
        # The txt file is written next to the XML file, with the same base name
        txtPath = os.path.splitext(xmlPath)[0] + ".txt"
        with open(txtPath, mode='w', encoding='UTF-8') as f:
            for lineList in lists:
                # join replaces the item-by-item comparison against the last element
                f.write(",".join(str(item) for item in lineList))
                f.write("\n")


if __name__ == '__main__':
    path = r"C:\Users\cam_robot\Desktop\船名数据集(SEU401)\LabelImgs"
    for root, dirs, files in os.walk(path, topdown=False):
        for file in files:
            portion = os.path.splitext(file)
            if portion[1] == ".xml":
                # Join with root (not path) so XML files in subfolders resolve correctly
                saveTxt(os.path.join(root, file))