Quantizing YOLOv3 with OpenVINO and Benchmarking Its AI Inference Performance on Intel 11th-Gen Core (Tiger Lake)

Sections covered in this post:
  • A brief look at the software environment and hardware
  • Using OpenVINO's bundled downloader to download and convert the model:
  1. Model conversion: convert the .pb model to the OpenVINO-supported xml and bin (IR) files;
  2. Generate the FP32 and FP16 model files;
  • Using the OpenVINO open-source tool POT to quantize the model:
  1. FP32 model -> INT8 model
  2. FP16 model -> INT8 model
  • Benchmarking the quantized model's inference performance on CPU, GPU, and CPU+GPU;
References:
  1. OpenVINO documentation: https://docs.openvinotoolkit.org/latest/index.html
  2. POT tool introduction: https://docs.openvinotoolkit.org/latest/pot_README.html
  3. Benchmark data: https://docs.openvinotoolkit.org/latest/openvino_docs_performance_benchmarks_openvino.html
A brief look at the software environment and hardware
  • Software environment:
    OS: Ubuntu 18.04; OpenVINO version: 2021.3; Python version: 3.6.9
  • Tiger Lake hardware:
    CPU: i7-1165G7; SSD: 256 GB; RAM: 4 GB DDR4 x 2
  • Installing OpenVINO:
    This post does not reproduce the installation steps; refer to the installation guide on the official website. The rest of this post assumes OpenVINO 2021.3 is already installed on your machine;
    ps: see the official site for the step-by-step installation instructions
Using OpenVINO's bundled downloader module to download, convert, and quantize the yolo-v3-tf model (all scripts in this section are run from openvino_2021/deployment_tools/open_model_zoo/tools/downloader)
  • Model download (the downloader scripts live in):
openvino_2021/deployment_tools/open_model_zoo/tools/downloader
  • Download the model with the script:
python3 downloader.py --name yolo-v3-tf

The download progress looks like this:
[figure: download progress] Once the download completes, the model is saved under the default download path (shown below);

Default storage path after download: openvino_2021/deployment_tools/open_model_zoo/tools/downloader/public/yolo-v3-tf

[figure: downloaded model files]
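If you are unsure of the exact name to pass to --name, the downloader can also list every model it knows about (the same --print_all flag appears in the quantizer.py source later in this post):

python3 downloader.py --print_all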

  • Conversion
    Use the converter.py script to convert the model, generating the FP32 and FP16 IRs; the command is below.
    (Note: since this post downloaded the model to the default path, the conversion also uses the default paths and needs no extra arguments; if you use your own paths, set the corresponding arguments; run python3 converter.py -h for details.)
python3 converter.py --name yolo-v3-tf

After a successful conversion, the log output is as follows:

intel@intel:~/intel/openvino_2021/deployment_tools/open_model_zoo/tools/downloader$ python3 converter.py --name yolo-v3-tf
========== Converting yolo-v3-tf to IR (FP16)
Conversion command: /usr/bin/python3 -- /home/intel/intel/openvino_2021/deployment_tools/model_optimizer/mo.py --framework=tf --data_type=FP16 --output_dir=/home/intel/intel/openvino_2021.3.394/deployment_tools/open_model_zoo/tools/downloader/public/yolo-v3-tf/FP16 --model_name=yolo-v3-tf '--input_shape=[1,416,416,3]' --input=input_1 '--scale_values=input_1[255]' --reverse_input_channels --transformations_config=/home/intel/intel/openvino_2021.3.394/deployment_tools/open_model_zoo/tools/downloader/public/yolo-v3-tf/yolo-v3.json --input_model=/home/intel/intel/openvino_2021.3.394/deployment_tools/open_model_zoo/tools/downloader/public/yolo-v3-tf/yolo-v3.pb

Model Optimizer arguments:
Common parameters:
	- Path to the Input Model: 	/home/intel/intel/openvino_2021.3.394/deployment_tools/open_model_zoo/tools/downloader/public/yolo-v3-tf/yolo-v3.pb
	- Path for generated IR: 	/home/intel/intel/openvino_2021.3.394/deployment_tools/open_model_zoo/tools/downloader/public/yolo-v3-tf/FP16
	- IR output name: 	yolo-v3-tf
	- Log level: 	ERROR
	- Batch: 	Not specified, inherited from the model
	- Input layers: 	input_1
	- Output layers: 	Not specified, inherited from the model
	- Input shapes: 	[1,416,416,3]
	- Mean values: 	Not specified
	- Scale values: 	input_1[255]
	- Scale factor: 	Not specified
	- Precision of IR: 	FP16
	- Enable fusing: 	True
	- Enable grouped convolutions fusing: 	True
	- Move mean values to preprocess section: 	None
	- Reverse input channels: 	True
TensorFlow specific parameters:
	- Input model in text protobuf format: 	False
	- Path to model dump for TensorBoard: 	None
	- List of shared libraries with TensorFlow custom layers implementation: 	None
	- Update the configuration file with input/output node names: 	None
	- Use configuration file used to generate the model with Object Detection API: 	None
	- Use the config file: 	None
	- Inference Engine found in: 	/home/intel/intel/openvino_2021/python/python3.6/openvino
Inference Engine version: 	2.1.2021.3.0-2787-60059f2c755-releases/2021/3
Model Optimizer version: 	    2021.3.0-2787-60059f2c755-releases/2021/3
[ SUCCESS ] Generated IR version 10 model.
[ SUCCESS ] XML file: /home/intel/intel/openvino_2021.3.394/deployment_tools/open_model_zoo/tools/downloader/public/yolo-v3-tf/FP16/yolo-v3-tf.xml
[ SUCCESS ] BIN file: /home/intel/intel/openvino_2021.3.394/deployment_tools/open_model_zoo/tools/downloader/public/yolo-v3-tf/FP16/yolo-v3-tf.bin
[ SUCCESS ] Total execution time: 31.89 seconds. 
[ SUCCESS ] Memory consumed: 1700 MB. 

========== Converting yolo-v3-tf to IR (FP32)
Conversion command: /usr/bin/python3 -- /home/intel/intel/openvino_2021/deployment_tools/model_optimizer/mo.py --framework=tf --data_type=FP32 --output_dir=/home/intel/intel/openvino_2021.3.394/deployment_tools/open_model_zoo/tools/downloader/public/yolo-v3-tf/FP32 --model_name=yolo-v3-tf '--input_shape=[1,416,416,3]' --input=input_1 '--scale_values=input_1[255]' --reverse_input_channels --transformations_config=/home/intel/intel/openvino_2021.3.394/deployment_tools/open_model_zoo/tools/downloader/public/yolo-v3-tf/yolo-v3.json --input_model=/home/intel/intel/openvino_2021.3.394/deployment_tools/open_model_zoo/tools/downloader/public/yolo-v3-tf/yolo-v3.pb

Model Optimizer arguments:
Common parameters:
	- Path to the Input Model: 	/home/intel/intel/openvino_2021.3.394/deployment_tools/open_model_zoo/tools/downloader/public/yolo-v3-tf/yolo-v3.pb
	- Path for generated IR: 	/home/intel/intel/openvino_2021.3.394/deployment_tools/open_model_zoo/tools/downloader/public/yolo-v3-tf/FP32
	- IR output name: 	yolo-v3-tf
	- Log level: 	ERROR
	- Batch: 	Not specified, inherited from the model
	- Input layers: 	input_1
	- Output layers: 	Not specified, inherited from the model
	- Input shapes: 	[1,416,416,3]
	- Mean values: 	Not specified
	- Scale values: 	input_1[255]
	- Scale factor: 	Not specified
	- Precision of IR: 	FP32
	- Enable fusing: 	True
	- Enable grouped convolutions fusing: 	True
	- Move mean values to preprocess section: 	None
	- Reverse input channels: 	True
TensorFlow specific parameters:
	- Input model in text protobuf format: 	False
	- Path to model dump for TensorBoard: 	None
	- List of shared libraries with TensorFlow custom layers implementation: 	None
	- Update the configuration file with input/output node names: 	None
	- Use configuration file used to generate the model with Object Detection API: 	None
	- Use the config file: 	None
	- Inference Engine found in: 	/home/intel/intel/openvino_2021/python/python3.6/openvino
Inference Engine version: 	2.1.2021.3.0-2787-60059f2c755-releases/2021/3
Model Optimizer version: 	    2021.3.0-2787-60059f2c755-releases/2021/3
[ SUCCESS ] Generated IR version 10 model.
[ SUCCESS ] XML file: /home/intel/intel/openvino_2021.3.394/deployment_tools/open_model_zoo/tools/downloader/public/yolo-v3-tf/FP32/yolo-v3-tf.xml
[ SUCCESS ] BIN file: /home/intel/intel/openvino_2021.3.394/deployment_tools/open_model_zoo/tools/downloader/public/yolo-v3-tf/FP32/yolo-v3-tf.bin
[ SUCCESS ] Total execution time: 31.12 seconds. 
[ SUCCESS ] Memory consumed: 1727 MB. 

The directory structure now looks like this; the FP16 and FP32 folders hold the xml/bin model files at the respective precisions:
[figure: FP16 and FP32 output folders]
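Before quantizing, it is worth loading one of the generated IRs with the Inference Engine Python API to make sure it runs. Below is a minimal sketch under a few assumptions: it is run from the downloader directory (so the default public/yolo-v3-tf layout applies) and test.jpg is a placeholder image name. Note that the conversion flags above already baked the 1/255 scaling and the channel reversal into the IR, so a raw OpenCV (BGR) image can be fed directly:

import cv2
import numpy as np
from openvino.inference_engine import IECore

# Assumed paths: the default downloader output layout (adjust to your setup)
model_xml = "public/yolo-v3-tf/FP32/yolo-v3-tf.xml"
model_bin = "public/yolo-v3-tf/FP32/yolo-v3-tf.bin"

ie = IECore()
net = ie.read_network(model=model_xml, weights=model_bin)
input_name = next(iter(net.input_info))

# The IR expects NCHW input of shape [1, 3, 416, 416]
n, c, h, w = net.input_info[input_name].input_data.shape

image = cv2.imread("test.jpg")  # placeholder test image
blob = cv2.resize(image, (w, h)).transpose((2, 0, 1))[np.newaxis, ...]

exec_net = ie.load_network(network=net, device_name="CPU")
outputs = exec_net.infer(inputs={input_name: blob})

# yolo-v3-tf produces three detection tensors, one per output scale
for name, tensor in outputs.items():
    print(name, tensor.shape)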

Quantizing the model to INT8 with the OpenVINO open-source tool POT (script: quantizer.py)
  • Quantizing the model to INT8
  1. Recent OpenVINO releases ship a quantizer.py script that makes INT8 quantization of a model much easier; under the hood it invokes the POT tool. The full script
    is listed below:
#!/usr/bin/env python3

# Copyright (c) 2020 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import argparse
import json
import os
import sys
import tempfile

from pathlib import Path

import yaml

import common

DEFAULT_POT_CONFIG_BASE = {
    'compression': {
        'algorithms': [
            {
                'name': 'DefaultQuantization',
                'params': {
                    'preset': 'performance',
                    'stat_subset_size': 300,
                },
            },
        ],
    },
}

DATASET_DEFINITIONS_PATH = common.OMZ_ROOT / 'tools/accuracy_checker/dataset_definitions.yml'


def quantize(reporter, model, precision, args, output_dir, pot_path, pot_env):
    input_precision = common.KNOWN_QUANTIZED_PRECISIONS[precision]

    pot_config_base_path = common.MODEL_ROOT / model.subdirectory / 'quantization.yml'

    try:
        with pot_config_base_path.open('rb') as pot_config_base_file:
            pot_config_base = yaml.safe_load(pot_config_base_file)
    except FileNotFoundError:
        pot_config_base = DEFAULT_POT_CONFIG_BASE

    pot_config_paths = {
        'engine': {
            # "type": "simplified"
            'config': str(common.MODEL_ROOT / model.subdirectory / 'accuracy-check.yml'),
        },
        'model': {
            'model': str(args.model_dir / model.subdirectory / input_precision / (model.name + '.xml')),
            'weights': str(args.model_dir / model.subdirectory / input_precision / (model.name + '.bin')),
            'model_name': model.name,
        }
    }

    pot_config = {**pot_config_base, **pot_config_paths}

    if args.target_device:
        pot_config['compression']['target_device'] = args.target_device

    reporter.print_section_heading('{}Quantizing {} from {} to {}',
                                   '(DRY RUN) ' if args.dry_run else '', model.name, input_precision, precision)

    model_output_dir = output_dir / model.subdirectory / precision
    pot_config_path = model_output_dir / 'pot-config.json'

    reporter.print('Creating {}...', pot_config_path)
    pot_config_path.parent.mkdir(parents=True, exist_ok=True)
    with pot_config_path.open('w') as pot_config_file:
        json.dump(pot_config, pot_config_file, indent=4)
        pot_config_file.write('\n')

    pot_output_dir = model_output_dir / 'pot-output'
    pot_output_dir.mkdir(parents=True, exist_ok=True)

    pot_cmd = [str(args.python), '--', str(pot_path),
               '--config={}'.format(pot_config_path),
               '--direct-dump',
               '--output-dir={}'.format(pot_output_dir),
               ]

    reporter.print('Quantization command: {}', common.command_string(pot_cmd))
    reporter.print('Quantization environment: {}',
                   ' '.join('{}={}'.format(k, common.quote_arg(v))
                            for k, v in sorted(pot_env.items())))

    success = True

    if not args.dry_run:
        reporter.print(flush=True)

        success = reporter.job_context.subprocess(pot_cmd, env={**os.environ, **pot_env})

    reporter.print()
    if not success: return False

    if not args.dry_run:
        reporter.print('Moving quantized model to {}...', model_output_dir)
        for ext in ['.xml', '.bin']:
            (pot_output_dir / 'optimized' / (model.name + ext)).replace(
                model_output_dir / (model.name + ext))
        reporter.print()

    return True



def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('--model_dir', type=Path, metavar='DIR',
                        default=Path.cwd(), help='root of the directory tree with the full precision model files')
    parser.add_argument('--dataset_dir', type=Path, help='root of the dataset directory tree')
    parser.add_argument('-o', '--output_dir', type=Path, metavar='DIR',
                        help='root of the directory tree to place quantized model files into')
    parser.add_argument('--name', metavar='PAT[,PAT...]',
                        help='quantize only models whose names match at least one of the specified patterns')
    parser.add_argument('--list', type=Path, metavar='FILE.LST',
                        help='quantize only models whose names match at least one of the patterns in the specified file')
    parser.add_argument('--all', action='store_true', help='quantize all available models')
    parser.add_argument('--print_all', action='store_true', help='print all available models')
    parser.add_argument('-p', '--python', type=Path, metavar='PYTHON', default=sys.executable,
                        help='Python executable to run Post-Training Optimization Toolkit with')
    parser.add_argument('--pot', type=Path, help='Post-Training Optimization Toolkit entry point script')
    parser.add_argument('--dry_run', action='store_true',
                        help='print the quantization commands without running them')
    parser.add_argument('--precisions', metavar='PREC[,PREC...]',
                        help='quantize only to the specified precisions')
    parser.add_argument('--target_device', help='target device for the quantized model')
    args = parser.parse_args()

    pot_path = args.pot
    if pot_path is None:
        try:
            pot_path = Path(
                os.environ['INTEL_OPENVINO_DIR']) / 'deployment_tools/tools/post_training_optimization_toolkit/main.py'
        except KeyError:
            sys.exit('Unable to locate Post-Training Optimization Toolkit. '
                     + 'Use --pot or run setupvars.sh/setupvars.bat from the OpenVINO toolkit.')

    models = common.load_models_from_args(parser, args)

    # We can't mark it as required, because it's not required when --print_all is specified.
    # So we have to check it manually.
    if not args.dataset_dir:
        sys.exit('--dataset_dir must be specified.')

    if args.precisions is None:
        requested_precisions = common.KNOWN_QUANTIZED_PRECISIONS.keys()
    else:
        requested_precisions = set(args.precisions.split(','))
        unknown_precisions = requested_precisions - common.KNOWN_QUANTIZED_PRECISIONS.keys()
        if unknown_precisions:
            sys.exit('Unknown precisions specified: {}.'.format(', '.join(sorted(unknown_precisions))))

    reporter = common.Reporter(common.DirectOutputContext())

    output_dir = args.output_dir or args.model_dir

    failed_models = []

    with tempfile.TemporaryDirectory() as temp_dir:
        annotation_dir = Path(temp_dir) / 'annotations'
        annotation_dir.mkdir()

        pot_env = {
            'ANNOTATIONS_DIR': str(annotation_dir),
            'DATA_DIR': str(args.dataset_dir),
            'DEFINITIONS_FILE': str(DATASET_DEFINITIONS_PATH),
        }

        for model in models:
            if not model.quantizable:
                reporter.print_section_heading('Skipping {} (quantization not supported)', model.name)
                reporter.print()
                continue

            for precision in sorted(requested_precisions):
                if not quantize(reporter, model, precision, args, output_dir, pot_path, pot_env):
                    failed_models.append(model.name)
                    break

    if failed_models:
        reporter.print('FAILED:')
        for failed_model_name in failed_models:
            reporter.print(failed_model_name)
        sys.exit(1)


if __name__ == '__main__':
    main()
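Two script options are worth noting before running it: --dry_run prints the POT commands without executing them, and --precisions restricts quantization to specific precisions. For example, a sketch of producing only the FP16-INT8 model (the precision names match the FP16-INT8/FP32-INT8 folders generated below):

python3 quantizer.py --name yolo-v3-tf --dataset_dir ~/Desktop/val2017 --precisions FP16-INT8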
  2. Quantization calibrates the model on sample data, so we need a dataset to drive that step; this post uses the coco val2017 dataset specified in the official demos together with its annotation file (as shown below; the val2017 folder contains the image files, and a quick sanity check of the layout is sketched after the list).
    The file structure is as follows:
    • val2017
      • val2017
      • instances_val2017.json
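Before running the quantizer, a few lines of Python can sanity-check the dataset layout. This sketch assumes the dataset sits at ~/Desktop/val2017, matching the command below:

import json
from pathlib import Path

dataset_root = Path.home() / "Desktop/val2017"  # assumed location, adjust as needed
images_dir = dataset_root / "val2017"
ann_file = dataset_root / "instances_val2017.json"

# Compare the images on disk with the entries in the annotation file
num_images = len(list(images_dir.glob("*.jpg")))
with ann_file.open() as f:
    ann = json.load(f)

print("images on disk:", num_images)                  # val2017 ships 5000 images
print("images in annotations:", len(ann["images"]))
print("annotation entries:", len(ann["annotations"]))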
        [figure: dataset layout] With the dataset prepared, we use quantizer.py on the FP32 and FP16 models to perform the INT8 quantization.
        Run the command below (when it finishes, two folders are generated, holding the FP32->INT8 and FP16->INT8 models):
python3 quantizer.py --name yolo-v3-tf --dataset_dir ~/Desktop/val2017

Log output from a successful run:

========== Quantizing yolo-v3-tf from FP16 to FP16-INT8
Creating /home/intel/intel/openvino_2021.3.394/deployment_tools/open_model_zoo/tools/downloader/public/yolo-v3-tf/FP16-INT8/pot-config.json...
Quantization command: /usr/bin/python3 -- /home/intel/intel/openvino_2021/deployment_tools/tools/post_training_optimization_toolkit/main.py --config=/home/intel/intel/openvino_2021.3.394/deployment_tools/open_model_zoo/tools/downloader/public/yolo-v3-tf/FP16-INT8/pot-config.json --direct-dump --output-dir=/home/intel/intel/openvino_2021.3.394/deployment_tools/open_model_zoo/tools/downloader/public/yolo-v3-tf/FP16-INT8/pot-output
Quantization environment: ANNOTATIONS_DIR=/tmp/tmp7f1r19ed/annotations DATA_DIR=/home/intel/Desktop/val2017 DEFINITIONS_FILE=/home/intel/intel/openvino_2021.3.394/deployment_tools/open_model_zoo/tools/accuracy_checker/dataset_definitions.yml

16:51:43 accuracy_checker WARNING: /home/intel/intel/openvino_2021.3.394/deployment_tools/tools/post_training_optimization_toolkit/compression/algorithms/quantization/optimization/algorithm.py:42: UserWarning: Nevergrad package could not be imported. If you are planning to useany hyperparameter optimization algo, consider installing itusing pip. This implies advanced usage of the tool.Note that nevergrad is compatible only with Python 3.6+
  'Nevergrad package could not be imported. If you are planning to use'

INFO:app.run:Output log dir: /home/intel/intel/openvino_2021.3.394/deployment_tools/open_model_zoo/tools/downloader/public/yolo-v3-tf/FP16-INT8/pot-output
INFO:app.run:Creating pipeline:
 Algorithm: DefaultQuantization
 Parameters:
	preset                     : performance
	stat_subset_size           : 300
	target_device              : ANY
	model_type                 : None
	dump_intermediate_model    : False
	exec_log_dir               : /home/intel/intel/openvino_2021.3.394/deployment_tools/open_model_zoo/tools/downloader/public/yolo-v3-tf/FP16-INT8/pot-output
 ===========================================================================
IE version: 2.1.2021.3.0-2787-60059f2c755-releases/2021/3
Loaded CPU plugin version:
    CPU - MKLDNNPlugin: 2.1.2021.3.0-2787-60059f2c755-releases/2021/3
Annotation conversion for ms_coco_detection_80_class_without_background dataset has been started
Parameters to be used for conversion:
converter: mscoco_detection
annotation_file: /home/intel/Desktop/val2017/instances_val2017.json
has_background: False
sort_annotations: True
use_full_label_map: False
Total annotations size: 5000
100 / 5000 processed in 0.419s
200 / 5000 processed in 0.424s
300 / 5000 processed in 0.423s
400 / 5000 processed in 0.422s
500 / 5000 processed in 0.436s
600 / 5000 processed in 0.428s
700 / 5000 processed in 0.427s
800 / 5000 processed in 0.427s
900 / 5000 processed in 0.427s
1000 / 5000 processed in 0.440s
1100 / 5000 processed in 0.436s
1200 / 5000 processed in 0.474s
1300 / 5000 processed in 0.438s
1400 / 5000 processed in 0.466s
1500 / 5000 processed in 0.439s
1600 / 5000 processed in 0.457s
1700 / 5000 processed in 0.445s
1800 / 5000 processed in 0.439s
1900 / 5000 processed in 0.446s
2000 / 5000 processed in 0.459s
2100 / 5000 processed in 0.445s
2200 / 5000 processed in 0.451s
2300 / 5000 processed in 0.447s
2400 / 5000 processed in 0.483s
2500 / 5000 processed in 0.619s
2600 / 5000 processed in 0.568s
2700 / 5000 processed in 0.561s
2800 / 5000 processed in 0.465s
2900 / 5000 processed in 0.471s
3000 / 5000 processed in 0.481s
3100 / 5000 processed in 0.473s
3200 / 5000 processed in 0.498s
3300 / 5000 processed in 0.448s
3400 / 5000 processed in 0.453s
3500 / 5000 processed in 0.493s
3600 / 5000 processed in 0.442s
3700 / 5000 processed in 0.496s
3800 / 5000 processed in 0.456s
3900 / 5000 processed in 0.488s
4000 / 5000 processed in 0.507s
4100 / 5000 processed in 0.474s
4200 / 5000 processed in 0.466s
4300 / 5000 processed in 0.434s
4400 / 5000 processed in 0.436s
4500 / 5000 processed in 0.483s
4600 / 5000 processed in 0.428s
4700 / 5000 processed in 0.430s
4800 / 5000 processed in 0.423s
4900 / 5000 processed in 0.430s
5000 / 5000 processed in 0.436s
5000 objects processed in 22.957 seconds
Annotation conversion for ms_coco_detection_80_class_without_background dataset has been finished
ms_coco_detection_80_class_without_background dataset metadata will be saved to /tmp/tmp7f1r19ed/annotations/mscoco_det_80.json
Converted annotation for ms_coco_detection_80_class_without_background dataset will be saved to /tmp/tmp7f1r19ed/annotations/mscoco_det_80.pickle
INFO:compression.statistics.collector:Start computing statistics for algorithms : DefaultQuantization
INFO:compression.statistics.collector:Computing statistics finished
INFO:compression.pipeline.pipeline:Start algorithm: DefaultQuantization
INFO:compression.algorithms.quantization.default.algorithm:Start computing statistics for algorithm : ActivationChannelAlignment
INFO:compression.algorithms.quantization.default.algorithm:Computing statistics finished
INFO:compression.algorithms.quantization.default.algorithm:Start computing statistics for algorithms : MinMaxQuantization,FastBiasCorrection
INFO:compression.algorithms.quantization.default.algorithm:Computing statistics finished
INFO:compression.pipeline.pipeline:Finished: DefaultQuantization
 ===========================================================================

Moving quantized model to /home/intel/intel/openvino_2021.3.394/deployment_tools/open_model_zoo/tools/downloader/public/yolo-v3-tf/FP16-INT8...

========== Quantizing yolo-v3-tf from FP32 to FP32-INT8
Creating /home/intel/intel/openvino_2021.3.394/deployment_tools/open_model_zoo/tools/downloader/public/yolo-v3-tf/FP32-INT8/pot-config.json...
Quantization command: /usr/bin/python3 -- /home/intel/intel/openvino_2021/deployment_tools/tools/post_training_optimization_toolkit/main.py --config=/home/intel/intel/openvino_2021.3.394/deployment_tools/open_model_zoo/tools/downloader/public/yolo-v3-tf/FP32-INT8/pot-config.json --direct-dump --output-dir=/home/intel/intel/openvino_2021.3.394/deployment_tools/open_model_zoo/tools/downloader/public/yolo-v3-tf/FP32-INT8/pot-output
Quantization environment: ANNOTATIONS_DIR=/tmp/tmp7f1r19ed/annotations DATA_DIR=/home/intel/Desktop/val2017 DEFINITIONS_FILE=/home/intel/intel/openvino_2021.3.394/deployment_tools/open_model_zoo/tools/accuracy_checker/dataset_definitions.yml

16:56:13 accuracy_checker WARNING: /home/intel/intel/openvino_2021.3.394/deployment_tools/tools/post_training_optimization_toolkit/compression/algorithms/quantization/optimization/algorithm.py:42: UserWarning: Nevergrad package could not be imported. If you are planning to useany hyperparameter optimization algo, consider installing itusing pip. This implies advanced usage of the tool.Note that nevergrad is compatible only with Python 3.6+
  'Nevergrad package could not be imported. If you are planning to use'

INFO:app.run:Output log dir: /home/intel/intel/openvino_2021.3.394/deployment_tools/open_model_zoo/tools/downloader/public/yolo-v3-tf/FP32-INT8/pot-output
INFO:app.run:Creating pipeline:
 Algorithm: DefaultQuantization
 Parameters:
	preset                     : performance
	stat_subset_size           : 300
	target_device              : ANY
	model_type                 : None
	dump_intermediate_model    : False
	exec_log_dir               : /home/intel/intel/openvino_2021.3.394/deployment_tools/open_model_zoo/tools/downloader/public/yolo-v3-tf/FP32-INT8/pot-output
 ===========================================================================
IE version: 2.1.2021.3.0-2787-60059f2c755-releases/2021/3
Loaded CPU plugin version:
    CPU - MKLDNNPlugin: 2.1.2021.3.0-2787-60059f2c755-releases/2021/3
Annotation for ms_coco_detection_80_class_without_background dataset will be loaded from /tmp/tmp7f1r19ed/annotations/mscoco_det_80.pickle
Loaded dataset info:
	Dataset name: ms_coco_detection_80_class_without_background
	Accuracy Checker version 0.8.6
	Dataset size 5000
	Conversion parameters:
		converter: mscoco_detection
		annotation_file: instances_val2017.json
		has_background: False
		sort_annotations: True
		use_full_label_map: False
ms_coco_detection_80_class_without_background dataset metadata will be loaded from /tmp/tmp7f1r19ed/annotations/mscoco_det_80.json
INFO:compression.statistics.collector:Start computing statistics for algorithms : DefaultQuantization
INFO:compression.statistics.collector:Computing statistics finished
INFO:compression.pipeline.pipeline:Start algorithm: DefaultQuantization
INFO:compression.algorithms.quantization.default.algorithm:Start computing statistics for algorithm : ActivationChannelAlignment
INFO:compression.algorithms.quantization.default.algorithm:Computing statistics finished
INFO:compression.algorithms.quantization.default.algorithm:Start computing statistics for algorithms : MinMaxQuantization,FastBiasCorrection
INFO:compression.algorithms.quantization.default.algorithm:Computing statistics finished
INFO:compression.pipeline.pipeline:Finished: DefaultQuantization
 ===========================================================================

Moving quantized model to /home/intel/intel/openvino_2021.3.394/deployment_tools/open_model_zoo/tools/downloader/public/yolo-v3-tf/FP32-INT8...

ps: How can you tell that the quantization succeeded? The simplest checks are to look at the size of the bin file, and to open the xml file and search for i8/u8-type keywords; a minimal version of that check is sketched below.
After a successful run, the generated folders contain the following:
[figure: generated FP16-INT8 and FP32-INT8 folders] Once you see these files, the model download, conversion, and quantization work is all done; next we move on to benchmarking the model on the device.
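Here is a minimal sketch of that keyword check, assuming the default downloader layout; a quantized IR contains FakeQuantize layers and i8/u8 element types, while the plain FP16/FP32 IRs should contain neither:

from pathlib import Path

# Assumed path: the default downloader output layout
xml_path = Path("public/yolo-v3-tf/FP16-INT8/yolo-v3-tf.xml")
text = xml_path.read_text()

# Count the tell-tale tokens of a quantized IR; all three should stay
# at zero for the unquantized FP16/FP32 xml files
for token in ("FakeQuantize", '"i8"', '"u8"'):
    print(token, "->", text.count(token), "occurrences")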

Benchmarking the model's inference on CPU, GPU, and CPU+GPU
  • This part of the testing uses the benchmark_app bundled with OpenVINO; the script lives at:
/openvino_2021/deployment_tools/tools/benchmark_tool
  • Next, change into that directory and benchmark the model (the INT8 one)
  1. CPU inference performance
    The command is as follows:
python3 benchmark_app.py -m /home/intel/intel/openvino_2021.3.394/deployment_tools/open_model_zoo/tools/downloader/public/yolo-v3-tf/FP16-INT8/yolo-v3-tf.xml -i /home/intel/Downloads/sample-videos-master/people-detection.mp4 -d CPU

Test results:

[Step 1/11] Parsing and validating input arguments
[ WARNING ]  -nstreams default value is determined automatically for a device. Although the automatic selection usually provides a reasonable performance, but it still may be non-optimal for some cases, for more information look at README. 
[Step 2/11] Loading Inference Engine
[ INFO ] InferenceEngine:
         API version............. 2.1.2021.3.0-2787-60059f2c755-releases/2021/3
[ INFO ] Device info
         CPU
         MKLDNNPlugin............ version 2.1
         Build................... 2021.3.0-2787-60059f2c755-releases/2021/3

[Step 3/11] Setting device configuration
[ WARNING ] -nstreams default value is determined automatically for CPU device. Although the automatic selection usually provides a reasonable performance,but it still may be non-optimal for some cases, for more information look at README.
[Step 4/11] Reading network files
[ INFO ] Read network took 56.83 ms
[Step 5/11] Resizing network to match image sizes and given batch
[ INFO ] Network batch size: 1
[Step 6/11] Configuring input of the model
[Step 7/11] Loading the model to the device
[ INFO ] Load network took 637.67 ms
[Step 8/11] Setting optimal runtime parameters
[Step 9/11] Creating infer requests and filling input blobs with images
[ INFO ] Network input 'input_1' precision U8, dimensions (NCHW): 1 3 416 416
[ WARNING ] No input files were given: all inputs will be filled with random values!
[ INFO ] Infer Request 0 filling
[ INFO ] Fill input 'input_1' with random values (image is expected)
[ INFO ] Infer Request 1 filling
[ INFO ] Fill input 'input_1' with random values (image is expected)
[ INFO ] Infer Request 2 filling
[ INFO ] Fill input 'input_1' with random values (image is expected)
[ INFO ] Infer Request 3 filling
[ INFO ] Fill input 'input_1' with random values (image is expected)
[Step 10/11] Measuring performance (Start inference asynchronously, 4 inference requests using 4 streams for CPU, limits: 60000 ms duration)
[ INFO ] First inference took 107.97 ms
[Step 11/11] Dumping statistics report
Count:      1196 iterations
Duration:   60204.83 ms
Latency:    200.71 ms
Throughput: 19.87 FPS
  2. GPU inference performance
    The command:
python3 benchmark_app.py -m /home/intel/intel/openvino_2021.3.394/deployment_tools/open_model_zoo/tools/downloader/public/yolo-v3-tf/FP16-INT8/yolo-v3-tf.xml -i /home/intel/Downloads/sample-videos-master/people-detection.mp4 -d GPU

Test results:

[Step 1/11] Parsing and validating input arguments
[ WARNING ]  -nstreams default value is determined automatically for a device. Although the automatic selection usually provides a reasonable performance, but it still may be non-optimal for some cases, for more information look at README. 
[Step 2/11] Loading Inference Engine
[ INFO ] InferenceEngine:
         API version............. 2.1.2021.3.0-2787-60059f2c755-releases/2021/3
[ INFO ] Device info
         GPU
         clDNNPlugin............. version 2.1
         Build................... 2021.3.0-2787-60059f2c755-releases/2021/3

[Step 3/11] Setting device configuration
[ WARNING ] -nstreams default value is determined automatically for GPU device. Although the automatic selection usually provides a reasonable performance,but it still may be non-optimal for some cases, for more information look at README.
[Step 4/11] Reading network files
[ INFO ] Read network took 52.69 ms
[Step 5/11] Resizing network to match image sizes and given batch
[ INFO ] Network batch size: 1
[Step 6/11] Configuring input of the model
[Step 7/11] Loading the model to the device
[ INFO ] Load network took 778.57 ms
[Step 8/11] Setting optimal runtime parameters
[Step 9/11] Creating infer requests and filling input blobs with images
[ INFO ] Network input 'input_1' precision U8, dimensions (NCHW): 1 3 416 416
[ WARNING ] No input files were given: all inputs will be filled with random values!
[ INFO ] Infer Request 0 filling
[ INFO ] Fill input 'input_1' with random values (image is expected)
[ INFO ] Infer Request 1 filling
[ INFO ] Fill input 'input_1' with random values (image is expected)
[ INFO ] Infer Request 2 filling
[ INFO ] Fill input 'input_1' with random values (image is expected)
[ INFO ] Infer Request 3 filling
[ INFO ] Fill input 'input_1' with random values (image is expected)
[Step 10/11] Measuring performance (Start inference asynchronously, 4 inference requests using 2 streams for GPU, limits: 60000 ms duration)
[ INFO ] First inference took 21.45 ms
[Step 11/11] Dumping statistics report
Count:      3540 iterations
Duration:   60101.89 ms
Latency:    67.88 ms
Throughput: 58.90 FPS
  3. CPU+GPU inference performance
    The command:
python3 benchmark_app.py -m /home/intel/intel/openvino_2021.3.394/deployment_tools/open_model_zoo/tools/downloader/public/yolo-v3-tf/FP16-INT8/yolo-v3-tf.xml -i /home/intel/Downloads/sample-videos-master/people-detection.mp4 -d MULTI:CPU,GPU

Test results:

[Step 1/11] Parsing and validating input arguments
[ WARNING ]  -nstreams default value is determined automatically for a device. Although the automatic selection usually provides a reasonable performance, but it still may be non-optimal for some cases, for more information look at README. 
[Step 2/11] Loading Inference Engine
[ INFO ] InferenceEngine:
         API version............. 2.1.2021.3.0-2787-60059f2c755-releases/2021/3
[ INFO ] Device info
         CPU
         MKLDNNPlugin............ version 2.1
         Build................... 2021.3.0-2787-60059f2c755-releases/2021/3
         GPU
         clDNNPlugin............. version 2.1
         Build................... 2021.3.0-2787-60059f2c755-releases/2021/3
         MULTI
         MultiDevicePlugin....... version 2.1
         Build................... 2021.3.0-2787-60059f2c755-releases/2021/3

[Step 3/11] Setting device configuration
[ WARNING ] Turn off threads pinning for CPUdevice since multi-scenario with GPU device is used.
[ WARNING ] -nstreams default value is determined automatically for CPU device. Although the automatic selection usually provides a reasonable performance,but it still may be non-optimal for some cases, for more information look at README.
[ WARNING ] -nstreams default value is determined automatically for GPU device. Although the automatic selection usually provides a reasonable performance,but it still may be non-optimal for some cases, for more information look at README.
[ WARNING ] Turn on GPU trottling. Multi-device execution with the CPU + GPU performs best with GPU trottling hint, which releases another CPU thread (that is otherwise used by the GPU driver for active polling)
[Step 4/11] Reading network files
[ INFO ] Read network took 52.11 ms
[Step 5/11] Resizing network to match image sizes and given batch
[ INFO ] Network batch size: 1
[Step 6/11] Configuring input of the model
[Step 7/11] Loading the model to the device
[ INFO ] Load network took 62460.94 ms
[Step 8/11] Setting optimal runtime parameters
[Step 9/11] Creating infer requests and filling input blobs with images
[ INFO ] Network input 'input_1' precision U8, dimensions (NCHW): 1 3 416 416
[ WARNING ] No input files were given: all inputs will be filled with random values!
[ INFO ] Infer Request 0 filling
[ INFO ] Fill input 'input_1' with random values (image is expected)
[ INFO ] Infer Request 1 filling
[ INFO ] Fill input 'input_1' with random values (image is expected)
[ INFO ] Infer Request 2 filling
[ INFO ] Fill input 'input_1' with random values (image is expected)
[ INFO ] Infer Request 3 filling
[ INFO ] Fill input 'input_1' with random values (image is expected)
[ INFO ] Infer Request 4 filling
[ INFO ] Fill input 'input_1' with random values (image is expected)
[ INFO ] Infer Request 5 filling
[ INFO ] Fill input 'input_1' with random values (image is expected)
[ INFO ] Infer Request 6 filling
[ INFO ] Fill input 'input_1' with random values (image is expected)
[ INFO ] Infer Request 7 filling
[ INFO ] Fill input 'input_1' with random values (image is expected)
[Step 10/11] Measuring performance (Start inference asynchronously, 8 inference requests using 4 streams for CPU, 2 streams for GPU, limits: 60000 ms duration)
[ INFO ] First inference took 207.95 ms
[Step 11/11] Dumping statistics report
Count:      4616 iterations
Duration:   60165.40 ms
Throughput: 76.72 FPS
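As a quick cross-check of the three reports above, throughput is simply the iteration count divided by the wall-clock duration; a few lines of Python reproduce the reported FPS figures:

# Count / Duration pairs taken verbatim from the benchmark_app reports above
results = {
    "CPU": (1196, 60204.83),
    "GPU": (3540, 60101.89),
    "MULTI:CPU,GPU": (4616, 60165.40),
}

for device, (count, duration_ms) in results.items():
    fps = count / (duration_ms / 1000.0)
    print(f"{device}: {fps:.2f} FPS")  # 19.87, 58.90, 76.72 FPS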
  • Brief summary
  1. The tests above only covered the INT8 model; interested readers can follow the same flow to benchmark the FP16 and FP32 models as well.
  2. The test video is in mp4 format and can be downloaded from OpenVINO's sample-videos repository on GitHub;
Conclusion
  • This post walked through using OpenVINO end to end: downloading a model, converting it to IR, quantizing it to INT8, and benchmarking the results;
  • This post surely has shortcomings; comments and exchanges are welcome so we can learn and improve together, thanks~
  • I hope this post was helpful to you; thanks for your likes and follows~~
