C is purer than C++. It has fewer obscure features and a less ambiguous grammar. Knowing this, I assumed there was nothing left for me to learn about the C language itself. I held on to that belief until I started reading open-source projects written in C, namely x264 and ffmpeg. This article is not about the techniques used in x264, only about the C language.
A colleague pinged me yesterday and asked how to read the array declaration below, which comes from the x264 implementation. I simplified it for this explanation:
int16_t (*mv[2][2])[2];
I had rarely seen an array declared this way. I paused for a few seconds and recalled the spiral rule I had learnt in college (probably 5 years ago). Back then I did not pay much attention to it, because I lacked the coding experience to understand it. At first I could not decipher the declaration in a way both of us could follow, so I went back through the spiral rule.
Following the spiral rule, we can draw the declaration like this:
             +-----------+
             | +---+     |
             | ^   |     |
    int16_t (*mv[2][2])[2];
     ^       ^     |     |
     |       +-----+     |
     +-------------------+
In plain English, it reads:
`mv` is a 2x2 array of pointers to `int16_t[2]`.
That may still be hard to follow, so let me expand it:
`mv` is a 2x2 array. Each element of the array is a pointer. Each pointer points to an `int16_t[2]` element.
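Now it should no longer look weird that x264 accesses `mv` with patterns such as `mv[0][1][6376][1]`: the first two subscripts pick a pointer, the third steps along the `int16_t[2]` elements it points to, and the last selects one of the two values.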
While debugging, we found that x264 sometimes uses negative indexes on an array, e.g.:
int t = some_random_int_array[-1];
My first reaction was: how on earth can an index be negative? It turns out to be legal and, unlike in Python, a negative index refers to an element before the one the name points at. This works because `array[idx]` is equivalent to `*(array + idx)`, so it is valid as long as the resulting address still lies inside the underlying array, for example when `array` is a pointer into the middle of a larger buffer. This SO thread explains it and quotes the following from C99 §6.5.2.1/2:
The definition of the subscript operator [] is that E1[E2] is identical to (*((E1)+(E2))).
The learning went on, and next I met designated initializers. I do not want to repeat every detail of the specification here. As a short example, I saw this in ffmpeg:
AVCodec ff_libx264_encoder = {
.name = "libx264",
.long_name = NULL_IF_CONFIG_SMALL("libx264 H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10"),
.type = AVMEDIA_TYPE_VIDEO,
.id = AV_CODEC_ID_H264,
.priv_data_size = sizeof(X264Context),
.init = X264_init,
.encode2 = X264_frame,
.close = X264_close,
.capabilities = CODEC_CAP_DELAY | CODEC_CAP_AUTO_THREADS,
.priv_class = &x264_class,
.defaults = x264_defaults,
.init_static_data = X264_init_static,
};
I could guess what the dot-prefixed member names meant, but I had never imagined C could do something like this!
These facts, some new and some old, refreshed my attitude towards C. I knew that C++ is a language whose details are hard to master, but I had always underestimated C as well. Languages keep evolving, even C.
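Over the past couple of days I built a face detection project based on a Cascade Classifier and put it on GitHub.
Its main features are:
The implementation is straightforward, but a few points are interesting and worth noting: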
####xml2header.cmake
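To preload the cascade file into memory, the project's `xml2header.cmake` script processes the cascade `xml` file and generates a `.h` header containing one long string, which the `.cpp` file then includes.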
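One problem is that some xml files are large, for example the commonly used `haarcascade_frontalface_alt.xml`. If such a file is compiled directly into a single static string, the compiler is likely to fail. So the cmake script splits the file into several smaller `std::string` pieces, and at program start-up `std::accumulate` joins them back into the complete cascade string. Also note that when the file is read into strings, `\` and `\\` in the file have to be converted into `\\` and `\\\\`, and a `\n` has to be appended to the end of every line.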
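####Reading the video source
For testing I used two kinds of I420 video sources: `.y4m` files, which have a header, and `.yuv` files, which have none. For `.y4m`, we can follow the descriptions of the y4m format available online to read the file frame by frame.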
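####Reading the cascade string from memory
To process the cascade string, we can create a stream with `FileStorage` and hand it to the `read` method of OpenCV's `cv::CascadeClassifier`. During implementation, however, I found that `read` only supports the new cascade file format, the kind produced by `traincascade` (see the OpenCV API documentation). To work around this, I rewrote a small part of the `load` function so that old cascade files can be read from memory as well.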
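Recently I have had to work on a project on Ubuntu. To make development easier, I used Eclipse's C++ plugin for debugging. In daily use, though, I kept running into an annoying problem: the Eclipse debugger (that is, gdb) has poor support for the C++ STL. Say I want to inspect the contents of a `std::vector`. With the Visual Studio debugger I can easily see the container's size and the value of each element, and Microsoft even lets users customize how the debugger displays a container. By default, however, Eclipse/gdb shows the following pile of information, which is of little use for debugging: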
bar {...}
std::_Vector_base<TSample<MyTraits>, std::allocator<TSample<MyTraits> > >
_M_impl {...}
std::allocator<TSample<MyTraits> > {...}
_M_start 0x00007ffff7fb5010
_M_finish 0x00007ffff7fd4410
_M_end_of_storage 0x00007ffff7fd5010
So I looked for a solution and found one on SO. It relies on a plugin called _Python libstdc++ printers_ to pretty-print the containers.
$> sudo apt-get install python2.7
$> sudo apt-get install gdb python2.7-dbg
$> mkdir ~/python_printer
$> cd ~/python_printer
$> svn co svn://gcc.gnu.org/svn/gcc/trunk/libstdc++-v3/python
Edit `~/.gdbinit`, creating it if it does not exist. Mine, as an example:
python
import sys
sys.path.insert(0, '/home/pengx17/python_printer/python')
from libstdcxx.v6.printers import register_libstdcxx_printers
register_libstdcxx_printers (None)
end
In Eclipse, under `Run->Debug Configurations...->Debugger`, set `GDB command file` to `/home/pengx17/.gdbinit`.
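A couple of days ago I ran into an interesting question:
Suppose there is a `class A` with a member function `func`, but we do not know the exact declaration or definition of `A`. If we have a `NULL` pointer to `A`, `A *a = NULL`, what might happen when we call `a->func()`?
Strictly speaking, calling a member function through a null pointer is undefined behavior in C++, so the standard guarantees nothing. Setting that aside and looking at how compilers typically implement such calls, the call may well not fail.
I summarized several different cases, as follows (Visual Studio 2012):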
#include <iostream>
#include <Windows.h>
#include <exception>
using namespace std;

class A
{
public:
    // Does not touch any member, so `this` is never dereferenced.
    void func()
    {
        cout << "wtf?" << endl;
    }
    // Reads a member through `this`.
    void func_this()
    {
        cout << "wtf: " << this->data << endl;
    }
    // Static: has no `this` at all.
    static void func_static()
    {
        cout << "static wtf?" << endl;
    }
    // Virtual: dispatched through the object's vtable pointer.
    virtual void func_virtual()
    {
        cout << "virtual wtf?" << endl;
    }
    A():data(0){}
    int data;
};

int main()
{
    A *a = NULL;
    a->func();
    // __try/__except is MSVC structured exception handling (SEH).
    __try
    {
        a->func_this();
    }
    __except(EXCEPTION_EXECUTE_HANDLER)
    {
        cout << "cannot invoke func_this" << endl;
    }
    a->func_static();
    __try
    {
        a->func_virtual();
    }
    __except(EXCEPTION_EXECUTE_HANDLER)
    {
        cout << "cannot invoke func_virtual" << endl;
    }
    return 0;
}
The command-line output is:
wtf?
cannot invoke func_this
static wtf?
cannot invoke func_virtual
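####Analysis
Let's first go through the two calls that ran normally, `func()` and `func_static()`:
- Calling `func()` does not actually need a valid `A` pointer. The type `A` is known at compile time, so the address of `func()` is already fixed, and the function body never touches `this`.
- `func_static()` does not need an instance at all.

As for `func_this()` and `func_virtual()`, which failed:
- `func_this()` reads a member through `this`, and `this` is `NULL` here, so the access raises an access violation (caught by `__except` in the code above).
- Calling `func_virtual()` needs a valid virtual function table (`vtable`) pointer, which is read from the object itself. With a null object that pointer cannot be fetched, so the call fails the same way.

That said, avoid relying on this in real code.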
####References