Storage methods for uploaded files on a website
Date: | 2018-08-10 15:00:00 |
Description: | Discusses methods for storing uploaded files on a website. |
Keywords: | Web, Storage |
Category: | engineering_technology/computer/programming/web |
Tag: | web, storage |
Link: | https://www.diewuxi.com/blog/article/61.html |
This article discusses methods for storing uploaded files on a website.
First, consider a directory structure with a fixed, finite number of levels that does not change dynamically.
Such a scheme should have these features:
- Every level holds a limited number of items, not too many.
- It can scale up.
Limit items
Use subdirectories, like multidimensional coordinates.
When a file is uploaded, it is placed in the right directory according to some property of the file from which a multidimensional coordinate can be derived, chosen so that no directory ends up with too many items.
Use increasing numbers
The simplest scheme uses increasing integers as directory names. Directories are created and filled one by one; once a directory reaches its limit, no more files are ever put into it, which makes time-based backups convenient.
The scheme has the following two forms:
(1) Use increasing integers directly.
First, decide the maximum number of items per directory, which determines how many digits each level uses, and then decide the directory depth. The deeper the structure, the longer it takes to fill.
Then, if necessary, pad the number with leading zeros. Use the last several digits, which bound the number of items, as the file's identifier within its directory. Then skip those digits and, moving left, split the remaining digits into groups of the same size; each group names one directory level.
As the number increases, this rule creates new directories one by one automatically.
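The splitting rule can be sketched as follows. This is a minimal illustration, not the site's actual code: the function name `idToPath` and its parameters `$digits` (decimal digits per level) and `$depth` (number of directory levels) are hypothetical.

```php
<?php
// Hypothetical helper: split a numeric id into directory levels plus a file slot.
// The last $digits digits identify the file inside its directory; the digits
// to the left are cut into $depth groups, one per directory level.
function idToPath(int $id, int $digits = 3, int $depth = 3): string
{
    $width = $digits * ($depth + 1);
    $padded = str_pad((string) $id, $width, "0", STR_PAD_LEFT);
    if ($id < 0 || strlen($padded) > $width) {
        throw new OutOfRangeException("id does not fit into the structure");
    }
    return implode("/", str_split($padded, $digits));
}

echo idToPath(999), "\n";   // 000/000/000/999
echo idToPath(1000), "\n";  // 000/000/001/000
```

Ids 999 and 1000 land in different deepest directories, which shows how a new directory appears as soon as the previous one is full.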
(2) Use a time string to generate increasing numbers.
This form is similar to the one above, except that the number of items in the deepest directory is not stable: uploads do not happen at a constant rate, so a directory may end up empty or hold a lot of files.
Use random numbers
The other scheme uses random numbers, including hexadecimal numbers.
The random numbers can come from a random number generator or from the file's hash code. If the length of the random number is constant, the maximum number of items in the structure is limited.
Decide the number of items per directory, which determines how many digits each level uses, and then the directory depth, as above.
Files then go to the target directory given by the random number, in no particular order.
Note that a random number may eventually repeat, which would overwrite an earlier file.
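One way to guard against such a repeat is to refuse to write when the name already exists and to draw a new number instead. The sketch below assumes a local file system; the function name `storeWithRandomName`, the two-level layout, and the retry count are hypothetical choices for illustration.

```php
<?php
// Hypothetical helper: store $contents under a random name without ever
// overwriting an existing file. $root and $maxTries are assumptions.
function storeWithRandomName(string $root, string $contents, int $maxTries = 5): string
{
    for ($i = 0; $i < $maxTries; $i++) {
        $name = bin2hex(random_bytes(16)); // 32 hexadecimal characters
        $dir = $root . "/" . substr($name, 0, 2) . "/" . substr($name, 2, 2);
        if (!is_dir($dir) && !mkdir($dir, 0775, true) && !is_dir($dir)) {
            throw new RuntimeException("cannot create directory $dir");
        }
        // Mode "x" fails if the file already exists, so a repeated name
        // is detected instead of silently overwriting the earlier upload.
        $handle = @fopen($dir . "/" . $name, "x");
        if ($handle !== false) {
            fwrite($handle, $contents);
            fclose($handle);
            return $dir . "/" . $name;
        }
    }
    throw new RuntimeException("no free name found after $maxTries tries");
}
```

When the name comes from the file's hash code rather than a generator, a retry is not possible; in that case the existing file would have to be compared with the new one before deciding what to do.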
Scale up
However it is designed, the directory structure can hold only a finite number of files, so when the limit is reached it must be scaled up. There are two levels of scaling up: on the same host and across different hosts.
On same host
Because the directory structure lives under one parent directory, when the original structure is full, configure the program to create another parent directory that holds a new structure of the same shape. The original structure then becomes read-only and is still reached through the path field in the database. Files there remain unique because the id is unique in the database.
uploads
  \_ 000/
  \_ 001/
  \_ 002/
  ...
  \_ 999/
uploads_001
uploads_003
...
uploads_999
But this is still finite: the parent directories can also reach their limit, so a grandparent directory is needed, and that is finite too. On a single host the scheme is therefore always finite, because of OS limits such as maximum path length and maximum items per directory.
On different host
Using different hosts gets past the single-host OS limits and is a more extensive way to add capacity.
Summary
These two levels of scaling up can be implemented by adding "folder" and "host" fields to the database.
--------------------------------------------------------------------------------
id      host    folder    name
------  ------  --------  -------
...     ...     ...       ...
89324   NULL    uploads   es.jpg    http://localhost/uploads/000/000/089/324
...     ...     ...       ...
--------------------------------------------------------------------------------
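As a rough illustration of how such a row could be turned back into a URL, the sketch below rebuilds the address shown for id 89324. The function name `rowToUrl`, the `http://` scheme, and the fallback to `localhost` when host is NULL are assumptions made for this example.

```php
<?php
// Hypothetical helper: rebuild a file URL from a database row that carries
// the "host" and "folder" fields. A NULL host falls back to $defaultHost.
function rowToUrl(array $row, string $defaultHost = "localhost"): string
{
    $host = $row["host"] ?? $defaultHost;
    // Same rule as for the directory path: pad the id to 12 digits, split by 3.
    $padded = str_pad((string) $row["id"], 12, "0", STR_PAD_LEFT);
    $idPath = implode("/", str_split($padded, 3));
    return "http://" . $host . "/" . $row["folder"] . "/" . $idPath;
}

$row = ["id" => 89324, "host" => null, "folder" => "uploads", "name" => "es.jpg"];
echo rowToUrl($row), "\n"; // http://localhost/uploads/000/000/089/324
```

Because the folder and host are read from the row, files stored before a scale-up keep resolving to their old location while new files use the new one.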
Example
Use number
Suppose the number of items per level is represented by a three-digit decimal number.
For example:
path:  001/034/985/.../896/345
        ^   ^   ^       ^
level:  1   2   3       n
Relation between the number of levels and the total number of directories:
---------------------------------------------
Levels     Total directories
---------  --------------------
1          1,000
2          1,000,000
3          1,000,000,000
4          1,000,000,000,000
---------------------------------------------
Implementation:
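// Example: id 89324 becomes 000/000/089/324
// Left-pad the id with zeros to 12 digits, then split it into groups of 3.
$id_string = str_pad((string) $this->id, 12, "0", STR_PAD_LEFT);
$primaryPath = implode("/", str_split($id_string, 3));
$path = $this->get_root($host) . "/" . Yii::$app->params['fileFolder'] . "/" . $primaryPath;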
Use current time
For example:
year/month/day/.../{time_unit}/file.ext
--------------------------------------------------------------------------------
Directory level   Items number
----------------  --------------------------------------------------------
year              1000 (manual limit)
month             12
day               ~30
hour              24
minute            60
second            60
millisecond       1000
microsecond       1000
--------------------------------------------------------------------------------
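A time-based path of this shape could be built as in the sketch below. The function name `timeToPath` and the `$levels` parameter are hypothetical; the format characters are those of PHP's `DateTimeImmutable::format`.

```php
<?php
// Hypothetical helper: build a time-based directory path.
// $levels counts how many units to use, from year (1) down to microsecond (8).
function timeToPath(DateTimeImmutable $time, int $levels = 4): string
{
    if ($levels < 1 || $levels > 8) {
        throw new InvalidArgumentException("levels must be between 1 and 8");
    }
    $parts = [
        $time->format("Y"),            // year
        $time->format("m"),            // month, 01-12
        $time->format("d"),            // day, 01-31
        $time->format("H"),            // hour, 00-23
        $time->format("i"),            // minute, 00-59
        $time->format("s"),            // second, 00-59
        $time->format("v"),            // millisecond, 000-999
        substr($time->format("u"), 3), // microsecond within the millisecond
    ];
    return implode("/", array_slice($parts, 0, $levels));
}

$t = new DateTimeImmutable("2018-08-10 15:00:00.123456");
echo timeToPath($t, 4), "\n"; // 2018/08/10/15
echo timeToPath($t, 8), "\n"; // 2018/08/10/15/00/00/123/456
```

Choosing `$levels` is the trade-off described earlier: a shallow path risks crowded directories during busy periods, while a deep path leaves many directories empty.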
Suppose each deepest directory holds 1000 files. The table shows the total directories and files handled when the structure stops at each level, and the sustained upload rate it can absorb.
-------------------------------------------------------------------------------------------------------
Levels      Total directories          Total files                     Capacity (files/s)
----------  -------------------------  ------------------------------  ------------------
1(year)     1,000                      1,000,000                       ~0.000,032
2(month)    12,000                     12,000,000                      ~0.000,385
3(day)      ~360,000                   ~360,000,000                    ~0.011,574
4(hour)     ~8,640,000                 ~8,640,000,000                  ~0.277,777
5(minute)   ~518,400,000               ~518,400,000,000                ~16.666,666
6(second)   ~31,104,000,000            ~31,104,000,000,000             ~1,000
7(ms)       ~31,104,000,000,000        ~31,104,000,000,000,000         ~1,000,000
8(us)       ~31,104,000,000,000,000    ~31,104,000,000,000,000,000     ~1,000,000,000
-------------------------------------------------------------------------------------------------------
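The capacity column is the number of files per deepest directory divided by the number of seconds one such directory covers. A small sketch of that arithmetic, assuming 30-day months and a 360-day year (12 months of 30 days) as the directory totals imply; the function name `capacityPerSecond` is hypothetical.

```php
<?php
// Seconds covered by one deepest directory at each level, assuming
// 30-day months and a 360-day year (12 x 30 days).
$secondsPerUnit = [
    "year"   => 360 * 86400,
    "month"  => 30 * 86400,
    "day"    => 86400,
    "hour"   => 3600,
    "minute" => 60,
    "second" => 1,
];
$filesPerDirectory = 1000; // the assumption stated in the text

// Sustained upload rate (files per second) that fills one directory
// exactly during the time span it covers.
function capacityPerSecond(int $filesPerDirectory, int $seconds): float
{
    return $filesPerDirectory / $seconds;
}

foreach ($secondsPerUnit as $unit => $seconds) {
    printf("%-6s %.6f files/s\n", $unit, capacityPerSecond($filesPerDirectory, $seconds));
}
```

A site whose average upload rate is well above the figure for a level would overfill directories at that level and should go one level deeper.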
Use hash code
{32-digit hexadecimal hash code}
32 hexadecimal digits can represent $16^{32}$ (~ $3.4 \times{} 10^{38}$) files.
Each level uses 2 characters, which can represent at most 256 directories. This is a middle choice: 1 character gives only 16 directories, too few, and 3 characters give 4096, too many.
For example:
ef/ab/7e/.../a3/efab7e...a35ebc2f...
--------------------------------------------------------------------------------
Directory level   Items number
----------------  --------------------------------------------------------
ef                256
ab                256
7e                256
...               ...
a3                256
--------------------------------------------------------------------------------
Suppose files are spread uniformly over the deepest directories and each one holds 1000 files. The table shows the total directories and files handled at different levels.
--------------------------------------------------------------------------------
Levels   Total directories   Total files
-------  ------------------  ------------------
1        256                 256,000
2        65,536              65,536,000
3        16,777,216          16,777,216,000
4        4,294,967,296       4,294,967,296,000
--------------------------------------------------------------------------------
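A hash-based path of the form shown in the example could be derived as in this sketch. The function name `hashToPath` and its parameters are hypothetical, and `md5()` is used only because it yields 32 hexadecimal characters; the article does not say which hash function is meant.

```php
<?php
// Hypothetical helper: map a hexadecimal hash to a nested directory path,
// taking $charsPerLevel characters of the hash for each of $levels levels.
// The full hash is kept as the file name in the deepest directory.
function hashToPath(string $hash, int $levels = 3, int $charsPerLevel = 2): string
{
    $hash = strtolower($hash);
    if (!ctype_xdigit($hash) || strlen($hash) < $levels * $charsPerLevel) {
        throw new InvalidArgumentException("hash is not hex or is too short");
    }
    $dirs = str_split(substr($hash, 0, $levels * $charsPerLevel), $charsPerLevel);
    return implode("/", $dirs) . "/" . $hash;
}

echo hashToPath(md5("hello")), "\n"; // 5d/41/40/5d41402abc4b2a76b9719d911017c592
```

With 3 levels of 2 characters this gives the 16,777,216 deepest directories listed in the table.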
Last modified: 2018-08-10