I noticed you're using 16-bit assembler, similar in syntax to 32-bit.
The string initialization instruction, STOS, takes an initializer in the
accumulator and a pointer to the string to initialize in ES:DI.
For a 4096-byte array, do:
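mov ax, seg arr ; a segment register cannot be loaded with an immediate
mov es, ax ; make ES:DI refer to string
mov di, offset arr
mov al, 0 ; initializer byte (0 is only an example value)
mov cx, 4096 ; initialize 4096 bytes
cld ; Direction UP - to higher addresses
rep stosb ; initialize string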
STOSB with DF==0 is equivalent to the following (except that STOSB leaves
the flags unchanged):
mov [es:di], al
inc di
4096 bytes is a whole number of DWORDS (1024), but STOSD requires a 386 or
later; in real mode or V8086 mode it needs an operand-size prefix and is not
available at all on earlier processors. Assuming the poster intended plain
16-bit code (indicated by the use of DI instead of EDI), 4096 bytes equals
2048 WORDS, so STOSW may be used with CX set to 2048 to increase performance.
STOSW stores all of AX, so the initializer byte must be in both AL and AH.
There is a further performance increase if the array of WORDS lies at an
address evenly divisible by two.
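A minimal sketch of the STOSW variant, assuming the same arr as above. The fill byte of 0 is only an example value, and AX is assumed to be free to use for loading the segment:

```asm
mov ax, seg arr ; load the segment through AX
mov es, ax ; make ES:DI refer to string
mov di, offset arr
mov al, 0 ; example fill byte (assumed value)
mov ah, al ; STOSW stores AX, so both bytes hold the fill byte
mov cx, 2048 ; 4096 bytes = 2048 words
cld ; Direction UP - to higher addresses
rep stosw ; initialize string one word at a time
```

Each iteration stores AX at ES:DI and advances DI by two, so the count in CX is in words rather than bytes.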