[Sumover-dev] Re: Possibility of merging vic features / multithreadedness

Fri May 29 17:03:39 BST 2009

Hi Andrew,

Sorry I didn't chip in earlier. As Leon mentions, both ffmpeg and x264
are designed with [multi]threading in mind. x264 will attempt to
auto-detect threading support when ./configure is run, and on Linux and
OSX it just picks up the threading support and uses it (so the versions
available should be using threading for x264). However, ffmpeg doesn't
enable threading by default, as it can have a detrimental effect on the
quality of certain codecs. I'm not sure if threading is implemented
fully for all codecs; anyway, threads can be enabled through a
configure option. You should be able to link with the multithreaded
version without a problem. On Windows things are a little more awkward,
as you'll need to set up a mingw and Visual C++ environment and
install the pthreadGC library; see:
http://ffmpeg.arrozcru.org/wiki/index.php?title=Pthreads
I haven't got around to trying as yet - I'll let you know if I do.

Piers.

2009/5/29 leon zadorin <leonleon77 at gmail.com>:
> On 5/29/09, Andrew Ford <acf0659 at rit.edu> wrote:
>> Hi Leon,
>
> Hi Andrew,
>
> sorry for the following short responses (I am running late for a
> meeting -- keeping busy :-)
>
>> Thanks for the mountain of info :) I agree for the most part re:the issues
>> with multithreading, but the fact is that, after some quick testing, it
>> appears that software encoding of 1080i mpeg4 streams above 10fps is
>> borderline impossible with the (single-threaded) UCL vic.
>
> Not multi-threaded (i.e. single-threaded) != IPC-incapable (i.e. have
> *many* *single* -threaded processes [e.g. daemons] running ala
> 'render' factory scenario which also allows for automatic-
> implied-load-balancing and x-box scalability). I was
> anti-multi-threading, *not* anti-concurrent processing -- those are
> *very* different concepts. I was for serial IO-api, single-threaded
> but IPC-distributed processing model -- as it is all too often that
> naive programmers jump 'too soon' for multi-threaded solutions
> every time they need concurrent/non-single-threaded processing.
>
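
The single-threaded but IPC-distributed model described above can be sketched roughly as follows. This is only an illustration and not code from either vic tree: the worker program, the `run_jobs` helper and the one-job-per-line text protocol are all invented for the example, and summing squares stands in for real work such as encoding part of a frame.

```python
import subprocess
import sys

# Hypothetical worker: a single-threaded process that reads one job per line
# on stdin and writes one result per line on stdout (serial IO, no shared memory).
WORKER_SOURCE = """
import sys
for line in sys.stdin:
    values = [int(v) for v in line.split()]
    # Stand-in for real work, e.g. encoding one slice of a frame.
    print(sum(v * v for v in values), flush=True)
"""


def run_jobs(jobs, worker_count=2):
    """Spread jobs round-robin over single-threaded worker processes."""
    workers = [
        subprocess.Popen(
            [sys.executable, "-c", WORKER_SOURCE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )
        for _ in range(worker_count)
    ]
    try:
        # Dispatch everything first so the workers can run concurrently.
        # (Fine for small jobs; large ones would need non-blocking IO to
        # avoid filling the pipes.)
        for index, job in enumerate(jobs):
            worker = workers[index % worker_count]
            worker.stdin.write(" ".join(str(v) for v in job) + "\n")
        for worker in workers:
            worker.stdin.close()  # flushes buffered jobs and signals end of input
        # Each worker answers in the order it was asked, so reading round-robin
        # restores the original job order.
        return [
            int(workers[index % worker_count].stdout.readline())
            for index in range(len(jobs))
        ]
    finally:
        for worker in workers:
            worker.stdin.close()
            worker.stdout.close()
            worker.wait()
```

With the two default workers, `run_jobs([[1, 2], [3], [4, 5, 6]])` returns `[5, 9, 77]`. Because the workers only talk over pipes, the same shape could in principle be stretched across machines by swapping the pipes for sockets.
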
> Having said this -- see my previous post re. video-related bandwidth
> (this is *not* by the way related too much to encoding/decoding
> processing but more to do with 'moving'
> uncompressed/decoded/scaled/etc image to-from display (to this degree
> we can see the computers, in general, were never designed as heavy
> IO-boxes, but rather fast calculators -- hence the need for evolved
> GPUs that really are purpose-built computers in their own
> right...))
>
>
>> This necessitates
>> taking advantage of libavcodec's multithreadedness at the very least if we
>
> I agree -- the fact that libavcodec does not offer IPC-concurrency
> (but rather only the multi-threaded version is offered) makes the
> concurrent processing via libavcodec multi-thread dependent (hence UQ
> Vic uses them :-)
>
>
>> vic, but I'm open to suggestions :) ). In addition, decoding 8-10 mpeg4
>> streams gets to be very taxing, so multithreaded decoding also seems
>> necessary if we want any decent amount of scalability, even for standard
>> definition.
>
> Well -- like I previously said -- scalability is not inhibited by
> single-threaded design at all :-) ... for as long as IPC is used.
>
>> Do you know or remember what specific parts of the UQ vic code, presumably
>> where it interfaces with libavcodec, allow multithreaded encoding and
>> decoding?
>
> Don't quite recall -- but my memory tells me that it is a simple
> switch (tell the lib how many threads it can use), so the libavcodec
> will tax the working threads (e.g. from its own internal pool) and
> then provide a 'fence' paradigm where the decoded portions are
> assembled and returned to calling code.
>
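
The "thread count plus fence" pattern described above might look something like the sketch below. It does not call libavcodec at all: `decode_slice` and `decode_frame` are hypothetical names, inverting sample values stands in for real decoding, and a frame is just a list of rows. In Python the threads will not actually speed up this pure-Python work; the point is only the shape of the call.

```python
from concurrent.futures import ThreadPoolExecutor


def decode_slice(rows):
    """Stand-in for per-slice decoding: invert 8-bit sample values."""
    return [[255 - sample for sample in row] for row in rows]


def decode_frame(frame, thread_count):
    """Split a frame into horizontal slices, process them on a pool, reassemble."""
    if not frame:
        return []
    # Never use more workers than there are rows to hand out.
    thread_count = max(1, min(thread_count, len(frame)))
    step = -(-len(frame) // thread_count)  # ceiling division
    slices = [frame[i:i + step] for i in range(0, len(frame), step)]
    with ThreadPoolExecutor(max_workers=thread_count) as pool:
        # map() yields results in submission order; draining it is the
        # "fence" where the caller waits for every slice to finish.
        decoded = list(pool.map(decode_slice, slices))
    # Reassemble the slices into one frame for the calling code.
    return [row for part in decoded for row in part]
```

The caller stays single-threaded: it passes a thread count, blocks until the whole frame is ready, and gets the same result whatever count it asked for.
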
>> Is it necessary for vic to be threaded in order for libavcodec
>> calls to be multithreaded?
>
> AFAICR -- No (but library linking -- i.e. linking with multithreaded
> libs is a common need for any multi-threaded apps -- C++/C coding
> practices 101 :-)
>
>> (Forgive me if that's a dumb question, it's been
>> a couple years since I took Operating Systems 101 :) ) I'm curious if it's
>> possible to quickly implement multithreaded encoding in the fork of UCL
>> mpeg4 vic that supports our HDMI capture cards, just to see if software HD
>> encoding in vic is a viable path to pursue.
>
> Sure -- should be very simple.
>
>> I'd consider porting mpeg4 etc
>> to UQ vic but there are some legal issues involved with the grabber code
>> for the HDMI card which would probably make it difficult.
>
> Grabbing code for Windows in HDMI is done for UQ vic by some NZ people
> I think... but not for linux -- driver things you see :-)
>
> W.r.t. mpeg4 -- adding this to VIC should be almost a zero-effort
> process (once one becomes familiar with the codebase :-)
>
>> Thanks,
>> --Andrew
>
> No probs -- glad to provide my 2 cents -- and once again -- sorry for
> rush response, I am running late and in Sydney this usually spells out
> a catastrophe :-) :-) :-)
>
> leon.
>
>>
>> 2009/5/6 leon zadorin <leonleon77 at gmail.com>
>>
>>> On 5/6/09, Douglas Kosovic <douglask at itee.uq.edu.au> wrote:
>>> > Hi Andrew,
>>> >
>>> > I'm CC'ing Leon, the author of UQ Vislab vic, who no longer works at
>>> > UQ Vislab.
>>>
>>> Hi everyone
>>>
>>> I will try to add my $.02 predicated on the fact that I don't have a
>>> recent memory of the current code status of UQ vic (spent last 2 years
>>> or so sadly away from UQ Vislab -- coding different things, but I
>>> don't think that the vic codebase has changed too much anyway)... I
>>> also may be forgetting the things I wrote myself so take my rants with
>>> a grain of salt... and sorry for a rushed response...
>>>
>>> >> I'd like to know if there are any licensing or other obstacles in
>>> >> merging features from the UQ vic to the mpeg4 vic branch. I know
>>> >> there are a lot to do, but primarily I'm concerned with the Linux
>>> >> DV1394 grabber, as it seems there's no way currently to take a DV
>>> >> input and transcode it to mpeg4 on linux (in the context of AG/vic)
>>> >> as far as I know. I saw mention of a 1394 grabber in the configure
>>> >> script for the mpeg4 branch - I haven't tried it yet, but how does
>>> >> it compare to the UQ version?
>>>
>>> Licensing -- from what I can remember, a dual license was adopted
>>> w.r.t. code written for vic in UQ Vislab (feel free to read the
>>> licensing -- it should be included in various files in vic's sources
>>> distro). The basic spirit/thesis is that one can use the code for as
>>> long as one, explicitly and publicly, gives credit to UQ Vislab for
>>> writing the code/mechanisms which are being used in the target
>>> project/product.
>>>
>>> Mpeg4 branch (UCL version I presume?) and the 1394 grabber therein -- no
>>> idea.
>>>
>>> Firewire grabber in UQ Vislab. From my distant memory:
>>>
>>> 1) uses libraw1394 (I think) to enumerate and list available AV devices;
>>>
>>> 2) uses libiec61883 to actually obtain streamed frames from the
>>> selected device (e.g. camera);
>>>
>>> 3) our grabber is, indeed, multithreaded -- caters for bursty vs
>>> isochronous nature of some firewire cards/drivers vs the frame-rate of
>>> the captured stream and small(ish) buffers in some firewire
>>> controllers;
>>>
>>> 4) the grabbed frames are already encoded (by camera) -- so no need to
>>> encode (if transmitting DV or HDV via MPEG-2 -- basically anything
>>> that libiec61883 supports)... just pass the delimited frames onto
>>> transmission layer (well -- that's the high-level analysis -- there
>>> are code bits that do needed recognition and wrapping -- but those are
>>> minor);
>>>
>>> 5) there are some cross-compatibility issues to keep in mind w.r.t. TS or PS
>>> streams in MPEG-2 as well as audio vs video streams -- libavformat (if
>>> I recall correctly) does that ok(ish) w.r.t. some formats (indeed --
>>> this is what is being used by our vic).
>>>
>>> 6) all encoding/decoding of new codecs in UQ Vislab vic (i.e. DV,
>>> mpeg2, etc) is done via libavcodec et al. It (libavcodec) does support
>>> multithreaded processing (but it depends on the actual codec in
>>> question) -- our codecs mostly are indeed supported via multithreaded
>>> decoding (this is, partially, why the decoding of HDV streams on
>>> systems running 2 dual-core processors in our labs was very very
>>> efficient)...
>>>
>>> 7) transcoding DV/MPEG2 into mpeg4 is possible and should not be too
>>> difficult (when frame is captured, decode and encode before passing it
>>> to the transmission layer -- a minor 'plug' that's all) -- but why?
>>> Bandwidth? If so -- just move to Tasmania -- it will be 100Mbps soon
>>> :-) On a more serious note -- one may consider the 'transcoding'
>>> module in vic as a codec itself (in the menu, etc.) this would yield a
>>> more elegant (architecture-wise) solution.
>>>
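
Points 3) and 7) above can be pictured together with a small sketch: a capture thread absorbs bursty delivery into a bounded buffer, and an optional transcode step sits between capture and transmission. None of this is taken from the UQ grabber; `capture`, `transmit`, `run_pipeline` and the string "frames" are hypothetical stand-ins, and no real firewire or codec library is involved.

```python
import queue
import threading


def capture(frames, buffer):
    """Capture thread: push frames as they arrive, possibly in bursts."""
    for frame in frames:
        buffer.put(frame)  # blocks while the buffer is full
    buffer.put(None)       # sentinel marking the end of the stream


def transmit(buffer, transcode, send):
    """Drain the buffer, optionally transcoding each frame before sending."""
    while True:
        frame = buffer.get()
        if frame is None:
            break
        send(transcode(frame))


def run_pipeline(frames, transcode=lambda frame: frame, buffer_size=4):
    """Run capture on its own thread and transmission on the caller's thread."""
    buffer = queue.Queue(maxsize=buffer_size)
    sent = []
    grabber = threading.Thread(target=capture, args=(frames, buffer))
    grabber.start()
    transmit(buffer, transcode, sent.append)
    grabber.join()
    return sent
```

With the default identity `transcode`, already-encoded frames pass straight through, as in point 4). Supplying a decode-then-encode callable is the minor 'plug' mentioned in point 7).
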
>>> Nowadays, I quite hate multithreading apps (even though I have been
>>> practically raised on multithreading and it was my bread'n'butter for
>>> the majority of my earlier coding years). Well... at least I hate
>>> multithreading in theory :-) [in practice there may be a very minor
>>> number of issues suited for it]
>>>
>>> My theoretical opposition to multithreading is based on things like:
>>>
>>> Single-threaded C/C++ standards (new C++0x does not count -- it is not
>>> a standard yet :-) ;
>>>
>>> Blanket optimization removal by compilers (e.g. lacking
>>> register-promotion across sync blocks, 'opaque function' treatment of
>>> locks etc. -- basically all of the issues listed in the publication on why
>>> threads cannot be implemented as a library, but must be embedded into
>>> compiler's optimization awareness/understanding) ;
>>>
>>> Invariable maintenance and evolution difficulties (inclusive of
>>> debugging et al) when compared to single-threaded stuff;
>>>
>>> Lack of scalability across systems (projects of spreading
>>> multithreading across systems don't count :-) :-) :-) ;
>>>
>>> Almost invariable lack of performance (compared to single-threaded
>>> stuff) when number of currently active threads exceeds number of
>>> available cores/cpus (e.g. even thread-switching causes
>>> registers/stack-state save/restore deficit; albeit some registers are
>>> lazy-restored)...
>>>
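
One practical consequence of the last point is that a requested thread count is usually worth capping at the number of available cores. A minimal sketch, assuming a hypothetical helper `pick_thread_count` (the `cores` parameter exists only so the behaviour can be checked without depending on the machine):

```python
import os


def pick_thread_count(requested, cores=None):
    """Cap a requested thread count at the number of available cores."""
    if cores is None:
        cores = os.cpu_count() or 1  # cpu_count() may return None
    # At least one thread, and never more than there are cores to run them.
    return max(1, min(requested, cores))
```

For example, `pick_thread_count(8, cores=4)` returns `4`, while `pick_thread_count(2, cores=4)` returns `2`.
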
>>> In my last year or so I have been experimenting with both the
>>> multithreaded and singlethreaded (multiplexed IPC) IO -- without a
>>> doubt what took a week (in terms of adding new features or debugging)
>>> on a single-threaded project was taking 4 times as long on
>>> a multithreaded project.
>>>
>>> From a high-level conceptualization standpoint -- multi-threading is
>>> logical, but computers are low-level non-ambiguous beasts and proper
>>> sync (not one big lock so as not to reduce itself to a single-threaded
>>> model, but just enough so as to protect all relevant blocks and at the
>>> same time not to cause dead-locks) -- is not such a simple task esp.
>>> when seen over the longer term of re-factoring and feature-evolution of
>>> products...
>>>
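
To make the locking point concrete: one common way to have per-object locks (rather than one big lock) without risking dead-locks is to always take the locks in a fixed order. The sketch below shows that technique only; it is not how vic does it, and `Channel`, `move_frame` and `stress` are hypothetical names.

```python
import threading


class Channel:
    """A frame list protected by its own lock (fine-grained, not one big lock)."""

    def __init__(self, name):
        self.name = name
        self.lock = threading.Lock()
        self.frames = []


def move_frame(src, dst):
    """Move one frame between channels, taking both locks in a fixed order."""
    if src is dst:
        return
    # Ordering by name means two threads moving in opposite directions
    # still acquire the locks in the same sequence, so neither can hold
    # one lock while waiting forever for the other.
    first, second = sorted((src, dst), key=lambda channel: channel.name)
    with first.lock, second.lock:
        if src.frames:
            dst.frames.append(src.frames.pop())


def stress(rounds=1000):
    """Run two threads that move frames in opposite directions."""
    a, b = Channel("a"), Channel("b")
    a.frames = list(range(10))
    b.frames = list(range(10, 20))
    forward = threading.Thread(
        target=lambda: [move_frame(a, b) for _ in range(rounds)])
    backward = threading.Thread(
        target=lambda: [move_frame(b, a) for _ in range(rounds)])
    forward.start()
    backward.start()
    forward.join()
    backward.join()
    return sorted(a.frames + b.frames)
```

`stress()` always finishes and returns every frame exactly once. If `move_frame` instead locked `src` first and `dst` second, the two threads could each hold one lock and wait on the other, and every later refactoring would have to preserve whatever ordering rule was chosen.
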
>>> ... single-threaded IO was also faster (even in many-to-many streams
>>> mappings) in terms of latency and bandwidth -- but, of course, that
>>> was under medium loading conditions.
>>>
>>> If I could continue development of vic I would consider IPC with
>>> serial IO (i.e. not shared mem)... but that's just a crazy Russian
>>> talking and it would only be a fleeting consideration -- as it may not
>>> suit vic where io-bound solution may manifest itself as a cpu-bound
>>> problem (saturating the cpu's bandwidth et al); and the fact that
>>> libavcodec is multithreaded (not offering IPC via read/write options)
>>> last time I checked it anyway...
>>>
>>> The main issue would be in the communicating of processed image
>>> portions between processes efficiently and without explicit
>>> (user-space, app-code) shared memory API  (to which zero-copy
>>> kernel-based IO assistance would be a worthy research attempt...
>>> perhaps even a model where a single core could share the registers
>>> from other cores would yield higher/wider CPU-bandwidth w/o the need
>>> for user-space multi-threading... or kernel-aware memcpy which would
>>> have pseudo OpenMP style of distribution and barriers across other
>>> cores.. acting like DMA for very large memory regions being copied...)
>>> -- but anyway I digress :-)
>>>
>>> > As for merging UQ Vislab VIC changes back into the UCL SVN repository,
>>> > that's a major project.
>>>
>>> Indeed -- it (UQ vic) not only includes grabbers, but also
>>>
>>> various network transmission mechanisms (fixes as well as improvements),
>>>
>>> gui-related features (*proper* full-screen done by a *window-manager*
>>> [and *not* directly by vic] thus allowing full memory of where the
>>> window was prior to maximisation in terms of its z-layer, {x,y,w,h}
>>> dims et al -- in fact, if I recall correctly, it was just a simple bug
>>> in Tcl/Tk that stopped vic from allowing full-screen mechanisms
>>> (responding to standard maximization/reparenting et al commands sent
>>> by window manager)  -- after we fixed it (a copy of Tcl/Tk in vic's
>>> source tree) a simple addition of a key-binding feature to Gnome or
>>> KDE allowed vic to truly fullscreen (kiosk-mode) like any other app
>>> would/should do),
>>>
>>> other improvements include X-video accelerated rendering and so on --
>>> so yeah: porting it would be a very big project indeed...
>>>
>>> ...maybe the other way around: porting new UCL features into UQ Vic
>>> would be easier :-) :-) :-) :-) :-)
>>>
>>> Or just look at UQ vic as a source of source :-) We basically
>>> leveraged existing technologies (libiec61883, libavcodec, libavformat,
>>> XVideo, TclTk, token bucket filters and so on) and 'integrated' them
>>> into our application needs -- you can do the same.
>>>
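
For readers unfamiliar with the token bucket filters mentioned above, here is a generic sketch of the idea as used for pacing network transmission. It is a textbook-style illustration, not the UQ vic implementation; the `TokenBucket` class, its parameters and the explicit `now` argument are assumptions made so the example is deterministic.

```python
class TokenBucket:
    """Allow bursts up to `capacity` bytes while averaging `rate` bytes/second."""

    def __init__(self, rate, capacity):
        self.rate = rate          # tokens (bytes) added per second
        self.capacity = capacity  # maximum burst size in bytes
        self.tokens = capacity    # start with a full bucket
        self.last = 0.0           # time of the previous update, in seconds

    def allow(self, size, now):
        """Return True and spend tokens if `size` bytes may be sent at `now`."""
        elapsed = max(0.0, now - self.last)
        # Refill for the time that has passed, without exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False
```

A sender would call `allow(len(packet), now)` before each packet and either delay or drop the packet when it returns `False`.
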
>>> Kind regards
>>> Leon Zadorin.
>>>
>>
>