Embedded Design Handbook

ID 683689
Date 8/28/2023
Public

4.2.3.4.4. Sharing Memory With Cache Performance Benefits

The delayed data-cache flush method is another way to safely share memory between a Nios® II processor with its data cache enabled and external peripherals without sacrificing processor performance. In this method, the Nios® II processor operates on memory using ordinary C or C++ accesses until it needs to share that memory with an external peripheral.

Note: Your application can share non-cache-bypassed memory regions with external masters if it runs the alt_dcache_flush() function before it allows the external master to operate on the memory.

To implement delayed data-cache flushing, the application image programs the Nios® II processor to follow these steps:

  1. Processor operates on memory: The Nios® II processor reads and writes a memory region. These are ordinary C/C++ accesses through pointers, arrays, data structures, variables, or a region obtained from malloc().
  2. Processor flushes cache: After the Nios® II processor completes its reads and writes, it calls the alt_dcache_flush() function with the start address and length of the memory region to flush. The processor can then signal the other memory master to operate on this memory.
  3. Processor operates on memory again: When the other master has completed its operation, the Nios® II processor can operate on the memory once again. Because the flush wrote dirty lines back to memory and invalidated them, subsequent reads fetch the data the other master wrote rather than stale cached copies, as long as the processor has not touched the region in the meantime.
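
The three steps can be sketched as one routine. This is a minimal sketch under stated assumptions, not the handbook's implementation: notify_master() and master_done() are hypothetical placeholders for whatever handshake your system uses (here the other master is simulated in software), and the #else branch stubs out alt_dcache_flush() so the flow can be compiled off target. The sketch also assumes the toolchain defines __NIOS2__ on target builds, where the real function comes from sys/alt_cache.h.

```c
#include <stddef.h>
#include <stdint.h>

#ifdef __NIOS2__                 /* assumption: defined by the Nios II toolchain */
#include <sys/alt_cache.h>       /* HAL declaration of alt_dcache_flush() */
#else
/* Off-target stub (assumption): counts calls so the flow can run on a host. */
static unsigned flush_calls;
static void alt_dcache_flush(void *start, uint32_t len)
{
    (void)start;
    (void)len;
    flush_calls++;
}
#endif

#define BUF_WORDS 16u

/* Hypothetical handshake flag; a real system might use a mailbox,
 * an interrupt, or a status register instead. */
static volatile int master_finished;

/* Hypothetical stand-in for the other master: doubles each word in place. */
static void notify_master(uint32_t *buf, size_t words)
{
    for (size_t i = 0; i < words; i++)
        buf[i] *= 2u;
    master_finished = 1;
}

static int master_done(void)
{
    return master_finished;
}

uint32_t shared_cycle(void)
{
    static uint32_t buf[BUF_WORDS];
    uint32_t sum = 0;

    /* Step 1: ordinary cached accesses. */
    for (size_t i = 0; i < BUF_WORDS; i++)
        buf[i] = (uint32_t)i;

    /* Step 2: flush only this region, then hand it to the other master. */
    alt_dcache_flush(buf, sizeof buf);
    master_finished = 0;
    notify_master(buf, BUF_WORDS);

    /* Do not touch the region until the other master reports completion. */
    while (!master_done())
        ;

    /* Step 3: cached accesses resume on the updated data. */
    for (size_t i = 0; i < BUF_WORDS; i++)
        sum += buf[i];
    return sum;
}
```

The wait loop sits between the flush and the next access on purpose: reading the region before the other master finishes would pull lines back into the cache, and those lines could then go stale.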

The example below shows an implementation of delayed data-cache flushing for memory accesses to a C array of structures. In the example, the Nios® II processor initializes one field of each structure in an array, flushes the data cache, signals to another master that it may use the array, waits for the other master to complete operations on the array, and then sums the values the other master is expected to set.

Data-Cache Flushing With Arrays of Structures

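#include <sys/alt_cache.h>   /* declares alt_dcache_flush() */

struct input foo[100];       /* each element has input and output fields */
int i, sum = 0;

/* Initialize one field of each structure through the data cache. */
for (i = 0; i < 100; i++)
	foo[i].input = i;
/* Flush the array to memory, then let the other master use it. */
alt_dcache_flush(foo, sizeof(foo));
signal_master(foo);
/* Wait here until the other master signals completion (system-specific). */

/* Sum the values the other master wrote. */
for (i = 0; i < 100; i++)
	sum += foo[i].output;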

The example below shows an implementation of delayed data-cache flushing for memory accesses to a memory region the Nios® II processor acquired with malloc().

Data-Cache Flushing With Memory Acquired Using malloc()

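char *data = (char *)malloc(sizeof(char) * 1000);   /* needs <stdlib.h> */

if (data != NULL) {
	write_operands(data);                        /* cached writes */
	alt_dcache_flush(data, sizeof(char) * 1000); /* flush only this region */
	signal_master(data);                         /* hand off to the other master */
	/* Wait here until the other master signals completion (system-specific). */
	result = read_results(data);                 /* read what the master wrote */
	free(data);
}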

The alt_dcache_flush_all() function flushes the entire data cache, which is less efficient than flushing a region because it also evicts cached data that the other master never touches. Intel recommends that you flush from the cache only the entries for the memory region that you make available to the other master peripheral.
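
As an illustration of flushing only the shared region, the sketch below hands a 64-byte header of a larger buffer to the other master and flushes just those bytes. The buffer layout, the sizes, and publish_header() are hypothetical. The #else branch is an off-target stub that records the request so it can be inspected on a host, and the sketch again assumes __NIOS2__ is defined on target builds.

```c
#include <stddef.h>
#include <stdint.h>

#ifdef __NIOS2__                 /* assumption: defined by the Nios II toolchain */
#include <sys/alt_cache.h>       /* HAL declaration of alt_dcache_flush() */
#else
/* Off-target stub (assumption): records the last flush request. */
static void *last_start;
static uint32_t last_len;
static void alt_dcache_flush(void *start, uint32_t len)
{
    last_start = start;
    last_len = len;
}
#endif

#define FRAME_BYTES  4096u       /* hypothetical size of the whole buffer */
#define HEADER_BYTES 64u         /* hypothetical size of the shared part */

static uint8_t frame[FRAME_BYTES];

/* Hypothetical helper: only the header is shared with the other master. */
void publish_header(void)
{
    for (size_t i = 0; i < HEADER_BYTES; i++)
        frame[i] = (uint8_t)i;

    /* Flush the 64 shared bytes rather than calling alt_dcache_flush_all(). */
    alt_dcache_flush(frame, HEADER_BYTES);
}
```

The rest of the buffer stays cached, so later processor accesses to it do not pay for a refill.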