Diewuxi

Uploaded file storage method in Website


Date: 2018-08-10 15:00:00
Description: Discusses methods for storing uploaded files on a website.
Keywords: Web, Storage
Category: engineering_technology/computer/programming/web
Tag: web, storage
Link: https://www.diewuxi.com/blog/article/61.html

This article discusses methods for storing uploaded files on a website.

First, consider a directory structure with a fixed, finite number of levels that does not change dynamically.

The scheme should have these features:

  • Each level holds a limited number of items.
  • It can scale up.

Limit items

Use subdirectories, like multidimensional coordinates.

When a file is uploaded, some property of the file is used to derive a multidimensional coordinate, and the file is put in the directory at that coordinate. Choosing the property well ensures that no directory ends up with too many items.

Use an increasing number

The simplest scheme names directories with an increasing integer. Directories are created and filled one after another, and once a directory reaches its limit no more files are ever put into it, which makes time-based backup convenient.

The scheme has the following two forms:

(1) Use the increasing integer directly.

First, decide how many items a directory may hold, which fixes the number of digits used per level, and then decide the directory depth. The greater the depth, the longer it takes to fill the directory structure.

Then, prefix the number with zeros if necessary. The last several digits, which limit the items per directory, identify the file within its directory. Skipping those digits and moving left, split the remaining digits into groups of the same size; each group names one directory level.

As the number increases, new directories are created automatically, one by one, according to this rule.
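The rule for form (1) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `id_to_path` and the parameters `digits_per_level` and `levels` are hypothetical, and three decimal digits per level is just one possible choice.

```python
def id_to_path(file_id: int, digits_per_level: int = 3, levels: int = 3) -> str:
    """Map an increasing integer id to a nested path.

    The last `digits_per_level` digits identify the file inside its
    directory; the digits to their left are split into `levels` groups,
    one group per directory level.
    """
    width = digits_per_level * (levels + 1)
    if file_id < 0 or len(str(file_id)) > width:
        raise ValueError("id does not fit in the directory structure")
    padded = str(file_id).zfill(width)  # prefix zeros
    groups = [padded[i:i + digits_per_level]
              for i in range(0, width, digits_per_level)]
    return "/".join(groups)


print(id_to_path(89324))  # 000/000/089/324
print(id_to_path(89325))  # 000/000/089/325 (same directory)
print(id_to_path(90000))  # 000/000/090/000 (a new directory is started)
```

Because the id only grows, directory 000/000/089 is never written to again once id 89999 has been stored.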

(2) Use a time string to generate increasing numbers.

This form is similar to the one above, except that the number of items in the deepest directory is not stable: uploads do not arrive at a constant rate, so a directory may be empty or may hold a lot of files.
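A minimal Python sketch of form (2), assuming the directory levels are year, month, day, hour, minute and second. The names `LEVEL_FORMATS` and `time_to_dir` and the `depth` parameter are hypothetical and only serve the illustration.

```python
from datetime import datetime, timezone

# Directory levels from coarsest to finest; `depth` selects how many are used.
LEVEL_FORMATS = ["%Y", "%m", "%d", "%H", "%M", "%S"]


def time_to_dir(moment: datetime, depth: int = 4) -> str:
    """Return the upload directory for `moment`, e.g. year/month/day/hour."""
    if not 1 <= depth <= len(LEVEL_FORMATS):
        raise ValueError("unsupported depth")
    return "/".join(moment.strftime(fmt) for fmt in LEVEL_FORMATS[:depth])


uploaded_at = datetime(2018, 8, 10, 15, 0, 0, tzinfo=timezone.utc)
print(time_to_dir(uploaded_at))           # 2018/08/10/15
print(time_to_dir(uploaded_at, depth=3))  # 2018/08/10
```

Every file uploaded during the same hour lands in the same directory here, which is why the item count per directory follows the upload rate rather than a fixed limit.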

Use a random number

The other scheme uses random numbers, including hexadecimal ones.

The random numbers can come from a random number generator or from the file's hash code. If the length of the random number is constant, the maximum number of items in the structure is limited.

As above, decide the number of items per directory, which fixes the number of digits used per level, and then the directory depth.

Files then go to the target directory given by the random number, so they are not stored in order.

Note that a random number may eventually repeat, which would overwrite the earlier file.
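One way to guard against the overwrite is to refuse to write when the name is already taken. The Python sketch below assumes the file's MD5 digest is used as the random number (chosen only because it has 32 hexadecimal characters) and that two characters name each level. The function names `hash_to_path` and `store` are hypothetical.

```python
import hashlib
import os
import tempfile


def hash_to_path(digest: str, chars_per_level: int = 2, levels: int = 3) -> str:
    """Use the leading hex characters as levels and the full digest as name."""
    parts = [digest[i * chars_per_level:(i + 1) * chars_per_level]
             for i in range(levels)]
    return "/".join(parts + [digest])


def store(root: str, data: bytes) -> str:
    """Write `data` under `root` without overwriting a different file."""
    digest = hashlib.md5(data).hexdigest()  # 32 hexadecimal characters
    path = os.path.join(root, *hash_to_path(digest).split("/"))
    os.makedirs(os.path.dirname(path), exist_ok=True)
    try:
        with open(path, "xb") as handle:  # mode "x" fails if the name exists
            handle.write(data)
    except FileExistsError:
        with open(path, "rb") as handle:
            if handle.read() != data:
                raise RuntimeError("name collision: " + path)
    return path


with tempfile.TemporaryDirectory() as root:
    saved = store(root, b"hello")
    print(os.path.relpath(saved, root))
```

With a content hash, a repeated name usually means the same file was uploaded twice, so the sketch treats identical content as already stored and raises only when the content differs.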

Scale up

The directory structure holds a finite number of files however it is designed, so when the limit is reached it must be scaled up. There are two levels of scaling up: on the same host and across different hosts.

On same host

Because the directory structure sits under one parent directory, when the original structure is full, configure the program to create another parent directory that contains a new directory structure. The original structure then becomes read-only and is accessed through the path field in the database. Files there remain unique because the id is unique in the database.

uploads
    \_ 000/
    \_ 001/
    \_ 002/
    ...
    \_ 999/

uploads_001

uploads_002

...

uploads_999
                        

But this is also finite: parent directories can reach their limit too, so a grandparent directory is needed, and those are finite as well. On a single host the scheme is therefore always finite because of OS limits, such as finite path length and a finite number of items per directory.
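As a rough illustration, the parent directory for a file could be derived from its id. This is an assumption made for the sketch: the article records the folder in the database rather than computing it, and the names `root_for`, `files_per_root` and `base` are hypothetical.

```python
def root_for(file_id: int, files_per_root: int = 10**12,
             base: str = "uploads") -> str:
    """Pick the parent directory that holds `file_id`.

    The first structure lives in `base`; each later one gets a
    three-digit numeric suffix.
    """
    if file_id < 0:
        raise ValueError("id must not be negative")
    index = file_id // files_per_root
    return base if index == 0 else f"{base}_{index:03d}"


print(root_for(89324))           # uploads
print(root_for(10**12))          # uploads_001
print(root_for(2 * 10**12 + 5))  # uploads_002
```

The position inside the chosen structure would then be `file_id % files_per_root`, so older structures are never written to again.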

On different host

Using different hosts breaks through the OS limits and is a more extensive way to gain capacity.

Summary

These two levels of scaling up can be implemented by creating "folder" and "host" fields in the database.

--------------------------------------------------------------------------------
id      host    folder      name            resulting URL
----    -----   -------     -------         ----------------------------------------
...     ...     ...         ...
89324   NULL    uploads     es.jpg          http://localhost/uploads/000/000/089/324
...     ...     ...         ...
--------------------------------------------------------------------------------
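The example row can be reproduced with a small Python sketch. It assumes that a NULL host means the local host, as the row suggests, and that ids are padded to 12 digits and split into groups of three. The function name `file_url` and the parameter `default_host` are hypothetical.

```python
from typing import Optional


def file_url(file_id: int, host: Optional[str], folder: str,
             default_host: str = "localhost") -> str:
    """Build the URL of a stored file from its database fields."""
    padded = str(file_id).zfill(12)
    levels = "/".join(padded[i:i + 3] for i in range(0, 12, 3))
    return f"http://{host or default_host}/{folder}/{levels}"


print(file_url(89324, None, "uploads"))
# http://localhost/uploads/000/000/089/324
```

Moving a full directory structure to another folder or host then only requires updating the "folder" and "host" fields of the affected rows.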
                        

Example

Use number

Suppose each directory level is named by a three-digit decimal number, so a directory holds at most 1,000 items.

For example:

path:   001/034/985/.../896/345
         ^   ^   ^       ^
level:   1   2   3       n
                        

Relation between the number of levels and the total number of directories:

---------------------------------------------
Levels          Total directories
---------       --------------------
1               1,000
2               1,000,000
3               1,000,000,000
4               1,000,000,000,000
--------------------------------------------
                        

Implementation:

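// Example: id 89324 -> "000000089324" -> "000/000/089/324"
// Left-pad the id with zeros to 12 digits, then split it into 3-digit levels.
$id_string = str_pad((string) $this->id, 12, "0", STR_PAD_LEFT);
$primaryPath = implode("/", str_split($id_string, 3));
$path = $this->get_root($host) . "/" . Yii::$app->params['fileFolder'] . "/" . $primaryPath;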
                        

Use current time

For example:

year/month/day/.../{time_unit}/file.ext


--------------------------------------------------------------------------------
Directory level         Items number
----------------        --------------------------------------------------------
year                    manual limit 1000
month                   12
day                     ~ 30
hour                    24
minute                  60
second                  60
millisecond             1000
microsecond             1000
--------------------------------------------------------------------------------
                        

Suppose each deepest directory holds 1000 files. The table shows the total directories and files handled when the structure stops at each level, and the resulting upload capacity in files per second.

-------------------------------------------------------------------------------------------------------
Levels      Total directories               Total files                         Capacity (files/s)
----------  ---------------------------     -------------------------------     -----------------------
1(year)                           1,000                           1,000,000                 ~0.000,032
2(month)                         12,000                          12,000,000                 ~0.000,385
3(day)                         ~360,000                        ~360,000,000                 ~0.011,574
4(hour)                      ~8,640,000                      ~8,640,000,000                 ~0.277,777
5(minute)                  ~518,400,000                    ~518,400,000,000                ~16.666,666
6(second)               ~31,104,000,000                 ~31,104,000,000,000             ~1,000
7(ms)               ~31,104,000,000,000             ~31,104,000,000,000,000         ~1,000,000
8(us)           ~31,104,000,000,000,000         ~31,104,000,000,000,000,000     ~1,000,000,000
-------------------------------------------------------------------------------------------------------
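The figures in this table follow from simple arithmetic, which the Python sketch below recomputes down to the second level. It assumes, as the table does, 1000 files per deepest directory, a manual limit of 1000 years, 30-day months and 360-day years. The names `LEVELS` and `capacity_table` are hypothetical.

```python
FILES_PER_DIRECTORY = 1000
YEARS = 1000  # manual limit on the year level

# (level name, items per parent directory, seconds covered by one directory)
LEVELS = [
    ("year", YEARS, 360 * 86400),
    ("month", 12, 30 * 86400),
    ("day", 30, 86400),
    ("hour", 24, 3600),
    ("minute", 60, 60),
    ("second", 60, 1),
]


def capacity_table():
    """Return rows of (level, total directories, total files, files per second)."""
    rows = []
    total_dirs = 1
    for name, items, seconds in LEVELS:
        total_dirs *= items
        rows.append((name, total_dirs, total_dirs * FILES_PER_DIRECTORY,
                     FILES_PER_DIRECTORY / seconds))
    return rows


for name, dirs, files, rate in capacity_table():
    print(f"{name:<8}{dirs:>20,}{files:>26,}{rate:>16.6f}")
```

The capacity column is the number of files one deepest directory may hold divided by the time span that directory covers, so it grows by the same factor as the directory count at each finer level.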
                        

Use hash code

{32-digit hexadecimal hash code}

A 32-digit hexadecimal number can represent $16^{32}$ (~ $3.4 \times{} 10^{38}$) files.

Every level uses 2 characters, which can represent at most 256 directories. This is a middle choice: 1 character gives 16 directories, which is too few, and 3 characters give 4096, which is too many.

For example:

ef/ab/7e/.../a3/efab7e...a35ebc2f...

--------------------------------------------------------------------------------
Directory level         Items number
----------------        --------------------------------------------------------
ef                      256
ab                      256
7e                      256
...                     ...
a3                      256
--------------------------------------------------------------------------------
                        

Suppose files are spread uniformly over the deepest directories and each holds 1000 files. The table shows the total directories and files handled at different levels.

--------------------------------------------------------------------------------
Levels      Total directories             Total files
-------     ------------------      -----------------
1                          256                256,000
2                       65,536             65,536,000
3                   16,777,216         16,777,216,000
4                4,294,967,296      4,294,967,296,000
--------------------------------------------------------------------------------
                        

Last modified: 2018-08-10
