Tiling slideshow 

Jun-Cheng Chen, Wei-Ta Chu, Jin-Hau Kuo, Chung-Yi Weng, and Ja-Ling Wu
http://www.cmlab.csie.ntu.edu.tw/~wtchu/TilingSlideshow/index.php

Version 1.01 README file (2006/12/04)


** 
This is a very preliminary release. It combines several separate modules, and 
the code has not been optimized, so processing may take longer than you expect. 
Program efficiency will be improved in the future.

If you have any suggestions, please contact Wei-Ta Chu at 
wtchu@cmlab.csie.ntu.edu.tw. 
**

Introduction 
===========================

Tiling slideshow is a new form of media that provides a rich photo browsing 
experience. This package automatically performs (1) photo filtering & 
clustering; (2) music beat analysis; and (3) spatial & temporal composition. 
For technical details, please refer to [1]. 


License 
===========================

The license for this package is provided in the "LICENSE" file.


Quick Start 
===========================

Step 1: Extract TilingSlideshow_v1.01.rar. This creates two directories: 
        "TilingSlideshow", which contains the main programs, and 
        "VirtualDub", which is empty. 
Step 2: Download VirtualDub from http://www.virtualdub.org/ and extract it 
        into the "VirtualDub" directory.
Step 3: Edit "photo_filelist.txt" in the "TilingSlideshow" directory to 
        indicate where your photos are. 
Step 4: On the command line, run 
        "TilingSlideshow.exe photo_filelist.txt your_wav_file parms.txt"
        where "your_wav_file" is the path to a wav file. 

Installation 
===========================

We use VirtualDub (http://www.virtualdub.org/) and Xvid 
(http://www.xvid.org/) for video encoding. Please install the Xvid codec 
first. Then download VirtualDub and place it alongside the extracted 
"TilingSlideshow" directory. 

In addition to video encoding, we use several packages for music beat 
analysis and face detection: 

  - Music beat analysis: we use the algorithm proposed in [2]. The package
    downloaded from http://sound.media.mit.edu/~eds/beat/tapping.tar.gz has 
    been recompiled with Cygwin and g++ for use in the Microsoft Windows 
    environment. The executable program "tapping.exe" is located in the 
    "beat_detection" directory. Note that the copyright belongs to the 
    original author. 

  - Face detection: we use the Intel Open Computer Vision Library (OpenCV) 
    for face detection. Some necessary descriptions are located in the 
    "face_detection" directory, and some DLLs copied from OpenCV are 
    located in the root directory. Note that the copyright belongs to the 
    original authors.


System Requirements 
===========================

Again, this is a very preliminary release, and the code has not been 
optimized. We recommend at least 512 MB of RAM and a 2.8 GHz or faster CPU. 
In our environment (2.8 GHz CPU and 1 GB RAM), processing 200 photos with a 
4.5-minute piece of music takes about 20 minutes. The bottlenecks are face 
detection and video encoding. 


Usage 
===========================

Usage: TilingSlideshow.exe photo_filelist wav_file parm_config

(1) photo_filelist: path of the file that lists the directories containing 
    your photos. The default directory paths are stored in 
    "photo_filelist.txt". Multiple directories can be listed. Sub-directories 
    are traversed recursively since version 1.01.

(2) wav_file: path of the wav file. 

(3) parm_config: path of the parameter file. The default parameters are stored 
    in parms.txt. 
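
As a rough illustration of the usage line above, the following Python sketch 
writes a photo list file and assembles the command. It assumes the list file 
holds one directory path per line, which this README does not state 
explicitly; the helper names are hypothetical and not part of the package. 

```python
import subprocess
from pathlib import Path


def write_photo_filelist(directories, list_path="photo_filelist.txt"):
    """Write one photo directory per line (assumed file format)."""
    text = "".join(f"{d}\n" for d in directories)
    path = Path(list_path)
    path.write_text(text)
    return path


def build_command(photo_filelist, wav_file, parm_config="parms.txt",
                  exe="TilingSlideshow.exe"):
    """Return the arguments in the order given by the usage line."""
    return [exe, str(photo_filelist), str(wav_file), str(parm_config)]


def run_slideshow(photo_filelist, wav_file, parm_config="parms.txt"):
    """Run the program; call this from the "TilingSlideshow" directory."""
    cmd = build_command(photo_filelist, wav_file, parm_config)
    return subprocess.run(cmd, check=True)
```

Keeping build_command separate from run_slideshow makes it possible to 
inspect the command before starting a run that may take many minutes. 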


Input 
===========================

(1) Photos: 
    - EXIF metadata: orientation and time information are necessary for correct 
      processing in orientation correction and time-based clustering. 
    - Number of photos: this program is suited to browsing large numbers of 
      photos. We suggest you prepare at least 200 photos to generate the 
      final result. 
    
(2) Music: 
    - Format: the current version supports only mono-channel wav files. 
      Please convert your music file to .wav in advance. 
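
Because only mono wav files are accepted, it can help to check the channel 
count beforehand. The sketch below is one possible way to do so with the 
Python standard library; it is not part of this package. It copies a mono 
file unchanged and downmixes a 16-bit stereo file by averaging the two 
channels. The function name is hypothetical, and other sample widths are 
rejected rather than converted. 

```python
import array
import sys
import wave


def ensure_mono_wav(src_path, dst_path):
    """Write src to dst as a mono wav file.

    Returns the number of channels found in the source file.
    """
    with wave.open(src_path, "rb") as src:
        channels = src.getnchannels()
        width = src.getsampwidth()
        rate = src.getframerate()
        frames = src.readframes(src.getnframes())

    if channels == 2:
        if width != 2:
            raise ValueError("this sketch only downmixes 16-bit PCM")
        samples = array.array("h")
        samples.frombytes(frames)
        if sys.byteorder == "big":
            samples.byteswap()  # wav sample data is little-endian
        # Stereo frames are interleaved: left, right, left, right, ...
        left, right = samples[0::2], samples[1::2]
        mono = array.array("h", ((l + r) // 2 for l, r in zip(left, right)))
        if sys.byteorder == "big":
            mono.byteswap()
        frames = mono.tobytes()
    elif channels != 1:
        raise ValueError("expected a mono or stereo file")

    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(width)
        dst.setframerate(rate)
        dst.writeframes(frames)
    return channels
```

For example, ensure_mono_wav("song.wav", "song_mono.wav") would produce a 
file that can be passed as the wav_file argument. 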


Parameter Settings 
===========================

Parameters are stored in parms.txt. They include: 

(1) QualityFiltering = 0 or 1. 
    - Indicates whether to detect blurred and over/underexposed photos and 
      filter out poor-quality ones. The default value is 1.
    
(2) ClusterSep = 0 or 1 or 2.
    - Selects one of three profiles. A larger value indicates that finer 
      clustering is preferred. This parameter influences the average number 
      of photos displayed in the same frame. The default value is 1. 

(3) AudioInterval = 0 or 1 or 2. 
    - Selects one of three profiles, which correspond to three different 
      search ranges used to find the timing for frame switching. The default 
      value is 0, which indicates r1=4 and r2=6. This parameter influences 
      the rate of frame switching: a larger value gives a lower frame 
      switching rate. 
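
The parameter ranges and defaults above can be summarized in a short Python 
sketch that validates settings and renders them as text. The "Name = value" 
line format mirrors the notation used in this section and is an assumption; 
check it against the parms.txt shipped with the package before relying on 
it. The function name is hypothetical. 

```python
# Allowed values and defaults as listed in this README.
ALLOWED = {
    "QualityFiltering": (0, 1),
    "ClusterSep": (0, 1, 2),
    "AudioInterval": (0, 1, 2),
}
DEFAULTS = {"QualityFiltering": 1, "ClusterSep": 1, "AudioInterval": 0}


def render_parms(**overrides):
    """Return parameter text, one "Name = value" line per parameter."""
    values = dict(DEFAULTS)
    for name, value in overrides.items():
        if name not in ALLOWED:
            raise KeyError(f"unknown parameter: {name}")
        if value not in ALLOWED[name]:
            raise ValueError(f"{name} must be one of {ALLOWED[name]}")
        values[name] = value
    return "".join(f"{name} = {values[name]}\n" for name in ALLOWED)
```

For example, render_parms(ClusterSep=2) keeps the other defaults and 
requests the finest clustering profile. 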


Output 
===========================

The default output is "slideshow.avi", which is written to the root 
directory. The current version uses the Xvid codec. More options may be 
provided in the future. 


References 
===========================

[1] J.-C. Chen, W.-T. Chu, J.-H. Kuo, C.-Y. Weng, and J.-L. Wu, "Tiling 
Slideshow," Proceedings of ACM Multimedia Conference, pp. 25-34, 2006. 

[2] E. D. Scheirer, "Tempo and beat analysis of acoustic musical signals," 
Journal of the Acoustical Society of America, vol. 103, no. 1, pp. 588-601, 1998.